NPS ARCHIVE 
1969 



WERTHER, M. 


RECURSIVE ALGORITHM FOR THE BEST APPROXIMATE 
SOLUTION OF LENEAR EQUATIONS WITH APPLICATION' 
TO SYSTEM IDENTIFICATION AND STATE ESTIMATIOh 




by 




Manfred Herman Fritz Werther 




DUDLEY KNOX LIBRARY 
NAVAL POSTGRADUATE SCHOOL 
MONTEREY, CA 93943-5101 



United States 
Naval Postgraduate School 




THESIS 



RECURSIVE ALGORITHM FOR THE BEST APPROXIMATE 
SOLUTION OF LINEAR EQUATIONS WITH APPLICATIONS 
TO SYSTEM IDENTIFICATION AND STATE ESTIMATION 



by 



Manfred Hermann Fritz Werther 



October 1969 



Tkci document ha/> been qppnovzd Ion public nz- 
lzae>z and tale.) i£i> diutAibvution untmctzd. 



l/J.'bran 
\J . S • a 

Uonterey . 



Calif° rnia 



Sctioofi 

93940 y 



SChc 
01 

Recursive Algorithm for the Best Approximate 
Solution of Linear Equations with Applications 
to System Identification and State Estimation 

by 

Manfred Hermann Fritz Jtferther 
Lieutenant Commander, Federal German Navy 



rus | GRADUATE 
f ' ^iEREY, CA 93943-51 



Submitted in partial fulfillment of the 
requirements for the degree of 

DOCTOR OF PHILOSOPHY 

from the 

NAVAL POSTGRADUATE SCHOOL 
October 1969 



UJ£V& ^ ■ 



ABSTRACT 



The sequential solution, in recursive form, of a 
growing set of linear equations, based upon the least- 
square-error and a weighted least-square-error criterion, 
are developed. For comparison these results are applied 
to the discrete-time solution of several estimation and 
identification problems. Recursive algorithms for pseudo 
inversion and the best approximate solution of a set of 
linear equations are included. Finally, efficient state 
estimation procedures for time-invariant systems using a 
sliding-window observer are presented. 



2 



.l’brary 
I . S . Naval 

Jonterey , 



Postgraduate 

Cal if via. a 



School 

9dy4Q 



TABLE OF CONTENTS 



I. INTRODUCTION 5 

II. THE NORMALIZED LEAST-SQUARE-ERROR SOLUTION 13 

A. LEAST-SQUARE-ERROR AND NORMALIZED 

LEAST-SQUARE-ERROR SOLUTION 14 

B. RECURSIVE RELATIONS 25 

III. THE BEST APPROXIMATE SOLUTION 5 7 

A. THE PSEUDO INVERSE 5 8 

B. RECURSIVE ALGORITHM FOR THE 

SEQUENTIAL LEAST-SQUARE-FIT 66 

C. ESTIMATING THE STATES OF A LINEAR 

DYNAMIC SYSTEM 81 

IV. FINITE ITERATION METHODS 85 

A. INFINITE ITERATION PROCEDURE 86 

B. FINITE ITERATION PROCEDURE 88 

C. MATRIX PSEUDO INVERSION 98 

V. RECURSIVE ALGORITHM FOR THE SLIDING- 

WINDOW OBSERVER 10 3 

A. THE MINIMUM-WINDOW OBSERVER 105 

B. SLIDING-WINDOW OBSERVER FOR TIME- 

INVARIANT SYSTEMS 110 

VI. SUMMARY AND RECOMMENDATIONS FOR FURTHER 

STUDY 119 

APPENDIX A 122 

A. ITERATIVE SOLUTION OF A SET OF 

NONLINEAR EQUATIONS 122 

B. SOLUTION OF THE DYNAMIC RESPONSE OF CIRCUITS 

CONTAINING NONLINEAR RESISTIVE ELEMENTS 134 

LIST OF REFERENCES 139 

INITIAL DISTRIBUTION LIST 141 

FORM DD 1473 143 



3 



ACKNOWLEDGEMENT 



The author wishes to express his appreciation to Dr. 
Sydney R. Parker for his encouragement, guidance and 
assistance in this work. 



4 



INTRODUCTION 



In a broad sense this dissertation is concerned with 
the basic problem of solving the linear matrix equation* 



Ax = b 



( 1 . 1 ) 



where A is an m x n matrix, x an n x 1 vector of unknowns, 
and b an m x 1 vector of constants. The solution, if A has 
rank r = n, is straightforward and reduces essentially to 
matrix inversion 



x = [A T A] A T b 



( 1 . 2 ) 



ip <p -1 

where A is the transpose of A, [A A] is the inverse of 

iji /\ 

A A, and where x denotes the solution of (1.1). However, 
when the rank of A is less than n, the set of equations 
(1.1) is underdetermined and infinitely many solutions 
exist. In order to select a unique solution out of all 
possible solutions further constraints are imposed. In 
the work presented here the minimum-norm solution, as de- 

/N 

fined by Penrose [5] , is selected. If x denotes the 
selected solution, and x^ any other possible solution, then 



*11 < llxj 



(1.3) 



where llxll = trace xx . The solution of (1.1) is further 



Throughout this dissertation a bar under a lower case 
letter represents a column matrix or vector. Capital 
letters generally refer to matrices. 
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complicated when the set of equations is inconsistent so 
that there is no solution which satisfies all equations. 

In this case a compromise solution has to be accepted 
such that all equations are satisfied as close as possible 
according to some criterion. Usually this compromise 
solution is selected by minimizing an error criterion J. 



where e = Ax - b. The most commonly used criterion is the 
least-square-error sum 



When this criterion is combined with the minimum-norm con- 
dition (1.3) a unique general solution results which 
Penrose [5] has defined as the best approximate solution and 
which is obtained using the concept of the pseudo inverse. 

This general approach to the solution of (1.1) is par- 
ticularly useful when applied to discrete-time system 
problems such as state estimation, parameter identification, 
and the limited memory observer problem as discussed in the 
body of this dissertation. As an example, consider the 
problem of estimating the states of a linear, dynamic system 
from noise-contaminated measurements. The dynamic system 
and the measurements are given by 



J = f (e) 



(1.4) 



T T 

J = e e = trace e e 




(1.5) 



th 

where e^ is the i element of the vector e 




(1.6a) 
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(1.6b) 



z k = M k *k + V k 



where is the system state vector, ]^_i the discrete, 
time-varying transition matrix, a time-varying observation 
matrix of dimensions 1 xn, the measurement noise, and z^. 
a scalar observation. After k observations the data may be 
arranged as follows 



r 

z 



1 



M 1 $ 



l,k 



M 0 
2 2 ,k 



^k 



z 



k 




(1.7) 



Thus the state estimation problem for linear, discrete-time 
systems is reduced to the problem of solving (1.1). When 
the transition matrix is the identity matrix this problem 
reduces to estimating a constant but unknown vector x. The 
estimation problem has been considered for many years by 
such famous mathematicians as Gauss [1], Penrose [4,5], 
Kalman [2], Deutsch [1] and many others. In spite of their 
differing approaches the underlying concept remains the 
solution of (1.1). 

The problem of identifying the coefficients of the 
recursive difference equation describing a time-invariant, 
linear system from a sequence of noisy response measure- 
ments can also be reduced to the solution of (1.1). Pre- 
vious investigators have solved this identification problem 
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using different methods such as correlation functions, 
deconvolution techniques, adaptive model techniques, and 
others as discussed by Mishkin and Braun [17], Eveleigh 
[18], Eykkoff [19] and many others. The approach using 
the problem formulation of (1.1) has been considered by 
R. C. K. Lee [3] . 

Thus, it has been well established that it is possible 
to solve discrete estimation or identification problems 
using the concept of the best approximate solution of (1.1). 
For real-time computation, as required for example in some 
self-adaptive control systems, it is desired to obtain 
numerical results sequentially with a minimum of computation 
time and data storage. Since the dimension, m, of matrix A 
grows at each step in time as additional data is acquired, 
it is desirable to formulate a sequential solution to (1.1) 
in recursive form. Almost all known algorithms are based 
upon the assumption that matrix A has rank r = n whenever 
m >. n, which might not be true in general. The algorithm 
developed in this dissertation solves (1.1) sequentially 
for the best approximate solution [5] without any assump- 
tions as to the rank of A. The normalized least-square- 
error solution and an algorithm, an alternate formulation 
based upon a different error criterion, are developed and 
applied to an estimation and an identification problem. 

For completeness, recursive non-sequential forms for 
obtaining the Penrose inverse and the best approximate 
solution of (1.1), when the matrix A has constant dimensions. 
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are included. Finally, efficient procedures for sequential 
state estimation of time-invariant systems are presented, 
where the state estimation is obtained from a finite but 
continuously updated set of observations (sliding-window 
observer) . These results fall into the category of linear 
observer theory as discussed by Luenberger [11] and Bona 
[ 12 ] . 

The development and discussion of the foregoing results, 
including geometric interpretation and examples, are pre- 
sented as follows. In Chapter II, Eqs . (1.1) are considered 

to be a set of overdetermined equations. The closed form 
solution, as well as the recursive solution (Kalman type 
filter) , are well known. However, a geometric interpre- 
tation of the known results suggests an alternate way of 
selecting the compromise solution using a weighted least- 
square-error criterion. This solution is defined here as 
the normalized least-square-error solution. The normalized 
least-square-error solution in recursive form, which requires 
only a slight modification of the least-square-error 
algorithm, is applied to a specific estimation problem as 
well as to an identification problem. The latter consists 
of identifying the coefficients of a difference equation 
describing a dynamic, linear, time-invariant system from 
system response data. The experimental results are quite 
favorable for the normalized least-square-error solutions. 

For the identification problem it is shown that with 
sinusoidal excitation the normalized least-square-error 
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solution results in a smaller bias error. However, these 
results cannot be generalized and whether the normalized 
least-square-error algorithm should be used depends 
entirely upon the specific problem under consideration. 

In Chapter III the general solution of (1.1) when the 
rank of A is less than or equal to n is discussed. First 
the definition and some properties of the pseudo inverse 
and the best approximate solution according to Penrose 
[4,5] are stated and some alternate expressions for the 
pseudo inverse are discussed. Then a complete recursive 
algorithm for the best approximate solution for the general 
case is developed for use as a sequential-estimation pro- 
cedure. The algorithm presented here has the advantage 
over previously published results in that the dimensions of 
the matrices involved in the algorithm remain constant 
irrespective of the dimension, m, and rank of A. It is 
interesting to note that the form of the resulting filter 
equation 

* k - Bk-i + a <z k - s T £ k -i> u.4) 

evolves naturally by solving the necessary equations and 
is not assumed a priori. Finally, the algorithm is adapted 
for state estimation of linear, time-varying dynamic 
systems . 

The solution of a fixed set of equations (1.1) is con- 
sidered in Chapter IV. Although it is possible to obtain 
the solution with the algorithm of Chapter III, this method 



10 



is not very efficient because only the final solution, with- 
out intermediate sequential estimates, is required. More 
efficient methods are obtained here by combining an infinite 
iterative error correcting method, as developed by Noda and 
Nagumo [10], and the Gram-Schmidt orthogonalization process 
[9]. The results are finite-step algorithms for the solu- 
tion of matrix equation (1.1), matrix inversion (when the 
matrix is nonsingular) , and matrix pseudo inversion for the 
general case. 

In Chapter V the sequential solution of a growing set 
of equations (1.1) is considered again. However it is 
assumed a priori that the set of equations is consistent 
and that any subsequent sets of n equations in (1.1) are 
independent. The problem is then solved with the ultimate 
goal of developing finite-memory, sliding-window observers. 
First, the general case of time-varying linear systems 
is considered and a new algorithm for the minimum-window 
observer (an observer with a memory limited to exactly 
the minimal set of data) is developed. However the high 
sensitivity of this observer to measurement noise severely 
limits its use [12]. Introducing the further constraints 
that the system and the observation matrix are time invariant 
leads to more useful and very efficient results. It is 
shown that with these constraints it is possible to construct 
simple and efficient filters for state estimation from noise- 
contaminated measurements using the results for the minimum- 
window observer. Also, using the concept of the pseudo 
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inverse, the memory of the observer may be extended so that 
a sliding-window observer of arbitrary length SL >. n is 
obtained. In this case the algorithm for state estimation 
reduces to 



2* - F l k -i + 2 z k (1 - 5 > 

where F and g are constant. The performances of these 
filters when processing noisy measurements is illustrated 
with an example. 

Finally, in Appendix A, the solution of a set of non- 
linear equations using the results of Chapter IV is 
attempted. An iteration scheme is presented in which the 
value of the total difference quotient for the nonlinear 
part of the set of equations is iterated sequentially to 
its true solution. Computational results are presented for 
examples where this iteration process converges while other 
commonly used methods, such as Newton-Raphson , Gradient, 
and Linear Interpolation fail. This iteration scheme is 
then proposed for the solution of sets of nonlinear 
differential equation for networks containing nonlinear, 
memory less, dissipative elements. 
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II. THE NORMALIZED LEAST- SQUARE-ERROR SOLUTION 



In this chapter the solution of a set of overdetermined 
simultaneous linear equations is considered first, using 
the concepts of the pseudo inverse as developed byPenrose 
[4,5]. The usual least-square-error solution is presented 
and interpreted geometrically. It is then shown that an 
alternate solution, which has been designated here as the 
normalized least-square-error solution, is also possible 
and leads to a different geometrical interpretation. These 
results are then applied to an estimation problem, and a 
recursive algorithm, based upon the normalized least-square- 
error solution, is developed. The resulting equations are 
similar in form to the Kalman [2] type of discrete filter 
as discussed by R. C. K. Lee [3] , but differ substantially 
in their precise formulation and the nature of the results. 

A numerical-estimation example comparing the least-square- 
error filter with the normalized least-square-error filter 
is presented. The recursive equations are then applied 
to the problem of identifying the coefficients of the 
difference equation describing a dynamic system from a 
sequence of noisy measurements of the system's response. 

The results of a numerical example are presented and com- 
pared with the results obtained using the usual least- 
square-error filter. These results indicate that the error 
in coefficient identification may be less for the normalized 
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least-square-error solution, as defined, than for the usual 
least-square-error solution available in the literature. 

An alternate approach to the solution of the identifi- 
cation problem is to use estimates for the past system 
response rather than the past observations themselves in the 
problem formulation. As shown in Example 2.3 this procedure 
may result in a better identification of the system co- 
efficients. Finally, it is demonstrated that in the 
presence of measurement noise a bias error in the identifi- 
cation begins to build whenever the input function is 
constant. However, when the input function is causing a 
significant dynamic system response, the estimation error 
approaches a constant bias. The analysis of bias error is 
performed by approximating the discrete formulations in 
the continuous time domain so that a limiting process can 
be performed readily. These results are verified experi- 
mentally . 

A. LEAST-SQUARE-ERROR AND NORMALIZED LEAST- SQUARE-ERROR 

SOLUTION 

The most common method of solving a set of m simul- 
taneous equations in n unknowns, where m > n, is the least- 
square-error solution. The problem consists of solving 
the algebraic relationships 

b = Ax (2.1) 

where A is the m x n matrix of coefficients 
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b is the m x 1 vector of constants 
x is the n x 1 vector of unknowns. 



A solution for x is the best approximate solution introduced 
by Penrose [5] . 



x = 



A + b 



( 2 . 2 ) 



where A + is the Penrose pseudo-inverse, and x is the best 
approximate solution. If the matrix A has rank n, this 
solution is obtained as* 



x 



[a t a] 



-1 



A b 



(2.3) 



T t 

where a denotes the transpose of A and [A A] the inverse 

T a 

of [A A] . In this case the solution x minimizes the cost 

function J 



J 





(2.4) 



where e = Ax 



b 



(2.5) 



and e^ represents the components of the vector e_. Thus 
this solution satisfies all the equations of (2.1) as close 
as possible in the least-square-error sense which is shown 
as follows. 



★ 

If A has rank less than n (i.e., the rows of A are not 
independent) the pseudo inverse has a different form as 
discussed in Chapter III. 
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The minimum of Eq. (1.4) occurs when 



3 J ~ T 

— =2[Ax-b]A=0 (2.6) 

9x — 

or A T Ax = A^b (2.7) 



Therefore 

/ 



/\ m -L rp 

x = [A A] A b 



( 2 . 8 ) 



A geometric interpretation of the least-square-error 
solution may be obtained by considering the two dimensional 
case where the vector x has two components x^ , x 2 . Each 
equation of (2.1) represents a line in the x-^, x 2 plane. 
These lines generally do not intersect at a single point. 
Now consider the sequence of lines given by 



Ax = b - e 



(2.9a) 



where e is a vector with arbitrary elements, e.. These 
lines are parallel to the original lines but shifted in 
the orthogonal direction by the amount, s^ , where 

s i = (a^? + a 2 ?) e i ' i = 1,2, ...m (2.9b) 

and a^j are elements of the matrix A. In the least-square- 
error solution the lines are shifted so that they all pass 
through the point x^ , x 2 so that the cost function, J 
m „ 

J= E e z (2.9c) 

1=1 1 

is minimized. 
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A different result may be obtained by considering each 
equation of (2.1) as a permissible geometric locus for the 
desired solution point. In general this locus is a hyper- 



plane in n dimensional space, where n is the number of 
elements in the unknown vector x. If all the loci intersect 
in a single point this solution corresponds to that of 
(2.3). If they do not intersect in a point, the solution 
point may be defined as the point which minimizes the sum 
of the distances squared to each locus. This solution is 
defined here as the normalized least-square-error solution. 
Its interpretation is quite different from the least-square- 
error solution and in certain types of identification prob- 
lems is seen to be more meaningful than the usual results 
available in the literature. 

The normalized least-square-error solution is a weighted 
least-square-error solution with weighting factors chosen 
such that the solution point lies as close as possible to 

all geometric loci described by (2.1). This result can be 

/\ 

obtained by selecting the solution x* such that the scalar 
cost function 



is minimized, where the d. 's are the distances from x* to 

l — 

the respective loci designated by the subscript i. Before 
deriving this solution it is desirable to prove the 
following. 



m 




J* = E 
i=l 



(2.10a) 
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The distance from a point 



P*(x 1 *,x 2 * ... * n *) to the plane 




c 



(2.10b) 



is given by 



| d | = | (a^T + a^ +...+ a^ ) (a.x * + a„x * + . . . +a x * - c) | 
'12 n 11 22 nn 



( 2 . 11 ) 



Proof of Eq. (2.11) is given for the three dimensional case. 
Extension to higher dimensions is obvious. 

Proof 

Let the point P*(x*,y*,z*) and the plane c = a^x+a 2 y+a 3 z be 
given. The unit vector normal to the plane is found by 
considering a plane through the origin parallel to the 
given plane. Its equation is 



which is the dot product of two vectors, one the position 
vector r to any point in the plane 



0 = a^x + a 2 y + a^z 



(2.12a) 



r = xi. + y£ + z4 



(2.12b) 



and the other a vector normal to the plane 



n = a^x + a 2 £ + a^ ( 

where X_, and *1 are unit vectors defining the three- 



(2.12c) 



dimensional cartesian space. The unit normal vector is then 



( 2 . 12d) 




3— 



where 

“i = U 1 + a 2 + a 3> 2 a i ' 1 = 1 ' 2 ' 3 

The distance |d| is then given by (See Fig. 2.1, where N is 
the point in the given plane closest to the given point P* 
and NP* denotes the vector from N to P*) 

t d | = | (r - r*) • *n) | • ( 2 . 12f ) 



where r is the position vector of a point in tie given plane and 
r* is the vector from the origin to the point P*. 

Since 

r •?} = a^x + a 2 y + a^z = (a^ + a^ + a^) c (2.12g) 

and 

r*-71= (a^ + a^ + a^) 2 (a 1 x* + a 2 y* + a 3 z*) (2.12h) 

it follows that 

_ig 

| d | = | (a^ + a 2 + a^) (a^* + a 2 y* + a^z* - c) | (2.12i) 

Extension to higher dimensions follows the same arguments 
as above and leads to the general result of (2.11). 

The solution to the equations (2.1) which minimizes the cost 
function (2.9) may be obtained by considering a slight 
modification of (2.3). Consider the vector d with elements 
d^ given by 

d = WAx - Wb (2.13a) 
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FIG. 2.1 DIAGRAM FOR PROOF OF£q. (2.11) 
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where W is a diagonal mxm matrix of weighting or normalizing 



factors w^ given by 





n / T 

w i;L = (_Z^ a i j) f (a.^) , i=l , 2 , . . . ,m 


(2.14) 


Eq. (2. 


13a) may be rewritten as 






d = A*x - b* 


(2.13b) 


where 


A* = WA 


(2.15) 


and 


b* = Wb 


(2.16) 



The solution of (2.13b) in the least-square-error sense is 
the desired solution, namely 

x* = [A* T A*] _1 A* T b* (2.17) 

It should be emphasized that this method solves a different 
set of equations (2.13a), derived from the original set of 
equations (2.1) by normalizing each equation, using the 
standard least-square-error-solution. Thus the normalized 
least-square-error solution is a weighted least-square- 
error solution of (2.1). This is quite different from 
other possible solutions of (2.1) minimizing alternate 
cost functions [18,21], i.e. 




where p is an integer or 
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m 

= Z 
i=l 



e i 



The following example demonstrates that the solutions (2.3) 
and (2.17) may differ appreciably. 

Example 2.1 : Solve the set of 4 equations in 2 unknowns: 



Given b = Ax 



n r " 1 1 



0 

, 1 ' 


A , 10 


1 

10 


w 


L-io 


10 



According to (2.3) 




.0 

.099 



According to (2.8) 




.0 

.05 



These results as well as the lines defined by the above 
equations are shown- in Fig. 2.2. 

This new approach to the solution may yield more meaning- 
ful results where, the indiscriminate use of the least-square- 
error procedure leads to unexpected or undesirable results. 
Consider for example the problem of estimating the parameters 
of the semiconductor diode model [14] from measured electri- 
cal data. If the diode is forward biased the model essen- 
tially reduces to a resistor R, representing the combined 
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+ LEAST SQUARE ERROR SOLUTION 
0 NORMALIZED LEAST SQUARE ERROR SOLUTION 



FIG. 2.2 EX. 2.1 
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body, lead, and contact resistance of the actual diode, 
in series with an ideal diode given by 

V D = k Jin ~ ~ k£n(I g ) . 

V D is the voltage across the ideal diode, I the diode 

current, I the reverse saturation current and k the 
s 

characteristic constant (for constant temperature) of the 
diode. If V represents the voltage across the actual 
diode, the model is described by 



V = IR + kAn(I) - kAn(I ) . 



An estimate for the constant parameters R, k, and 
-k£n(I ) is obtained from £n measurements of voltage and 

S \ 

current for the forward range of the diode. The measure- 
ments may be written in the form 



n 



V, 



V, 



V 



m 



1 I ± Jlnd 1 ) 

1 I 2 £n(I 2 ) 

• * * 

r • t> 

• • 

1 I An(I ) 

m m 



-kAn(I ) 
s' 



R 




The least-square-error solution results in a model which 
is very poor for small currents and excellent for very large 
currents , because the weights attached to the measurement vectors 
[1 I i £n(I^)] may differ by orders of magnitude. Therefore 
the normalized least-square-error solution, which weighs 



2 4 



'? 

each measurement vector equally, is preferred, resulting 
in a model which describes the actual diode adequately for 
the entire forward range. 

B. RECURSIVE RELATIONSHIPS 

1 . Development of the Recursive Relations for an 
Estimation Problem 

In estimation* * problems the least-square-error 
solution is widely used in the form of a recursive relation 
for sequential estimation where in addition to a set of 
solved equations one new equation is considered and its 
data taken into account**. A typical example of such an 
estimation problem is to determine the set of initial 
conditions for a dynamic system from a sequence of discrete 
observations. Consider the system equation given in dis- 
crete form as follows: 



- <HT> Z k _! 



z k “ M 



(2.18) 

(2.19) 



where and y^_^ denote the state vectors at discrete 
times kT and (k-l)T respectively. $ (T) is the known 
transition matrix of the system for the discrete time inter- 
val T. M is the known observer matrix of dimensions 1 x n 
and Zy. is the scalar observation at time kT. If the state 



The term estimation is used to designate problems in- 
volving the solution of (2.1) where the elements of A are 
known exactly. 

* * 

See Ref. 3, page 51. 
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vector at k = 0 is denoted by the vector x the observations 



can be listed in the following form: 



z 




M 


o 




— — — 


Z 1 






• 


= 




• 




• 


• 




— — — 


_ z k_ 




MA k-^ 
_M$ _ 



( 2 . 20 ) 



Estimation of the vector x is equivalent to solving (2.1). 
Thus in general form (2.20) may be written as 



z, = A. x (2.21) 

— k k— 

If Aj, is of maximum rank, Lee [3] defines (M,<J>) as an 
observable pair and (2.21) has a least-square-error solution 
according to (2.3): 



/\ 




T 

P, A, z, 
k k— k 



( 2 . 22 ) 



where z, is the vector of k observations 

A^ the k x n coefficient matrix, A^= [M^l J . . . ^ 

P, the 
k 

/N 

x^ the least-square-error solution for x based upon 
the k observations 

Now consider the (k+l)st equation 

z k+1 = a T x (2.33) 



. T 

inverse of the matrix A^A^ and 



where z^ + ^ is the new observation or data in the vector 

T T k+ * 

and a a row of new coefficients. a is equal to M$ for 
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the estimation example above. It is now possible to define 



the matrix A, , and the observation vector z, , , . 

k+1 • — k+1 



k+1 





A kl 


z == 


^k 




T 

a 


' -k+1 


_ z k+l_ 



(2.24) 



The least-square-error solution then takes the form 



-k+1 P k+1 A k+1 -k+1 



,T. 



where P, , . is the inverse of the matrix [A" '.A. 

k+1 k+1 k+1 



(2.25) 
] . This 



result can be written in the following recursive form [3]. 

(z. 



= x, + 



P k^ 



-k+1 -k ’ . . T_ 

1+a P, a 
— k— 



T , \ 

"k+1 - -k } 






^ P k+1 P k 



P k-— P k 
l+a T P k a 



(2.26) 



It should be noted that (1+a P. a) is a scalar and is 

— k— 

treated accordingly. The derivation of (2.26) normally 
available in the literature is rather involved and a short 
derivation which has been developed here is presented. 

— 1 T 



P, = A, A, 
k k k 



(2.27) 



-1 T T T “1 T 

P k+1 = A k+1 \+l = A k A k + -- = P k + -- 



- 1 t -1 

P = [I + a a P, ] P, 



k+1 



-- k 



(2.28a) 
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After premultiplying by P k+ ^ and postmultiplying by P^. 
(2.28a) becomes 



P, = P, . . + P, , . a a T P, 
k k+1 k+1 k 



(a. 28b) 



Combining (2.28b) with (2.28a) yields 



p k + l ■ p k - p k fI + — — T p k^ _1 — — T P k 



(2.29) 



T “1 

The expression [I+aa P, ] a may be simplified. 



T - 1 

Let £ = [It a a P ] a 



then £(l+a T P k a) = [I + aa T P k ] "'’a (1 + a T P k a) 



[I + aa T P k l [a + aa T |a] 



[I + aa T P k ] _1 [I + aa T P k ]a 



= a 



Thus ‘ = a/(l + a p k ^) and (2.9) reduces to 



P = P - 
k+1 k 



p k^ Tp k 

T 

1+a P. a 
— k— 



(2.30) 



Then 



-k+1 



P k+1 A k+1 -k+1 



P k - 



p k^ T V 

l+a T P a 
— k— 



A k— k + z k+l^ 



= P, A. z, - 
k k— k 



P k* 



T 

1+a P, a 
— k— 



T T / P k— — P k a ' 

a P. A. z, + z. .. P. a 
- k k-k k+1 \ k- . , T_ I 

1+a P. a 
— k— 
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= *k - 



p k^ 



T 

1+a P, a 
— k— 



* *k + z k+l 



P k^ 

l+a^'p a 
— k— 



Sk+l - Sk + 



p k^ 



T 

1+a P, a 
— k— 



(z k+i ' ^ t 4> 



(2.31) 



(2.30) and (2.31) are the desired equations of the recursive 
form (2.26) . 

By comparing (2.3) and (2.17) the recursive relationship 
for the normalized least-square-error solution follows 



= x* + 



P*a* 

k— 



-k+1 -k 



P* = p* - 

k+1 F k 



1+a* P*a* 
— k— 



(z k +1 



P*a*a* P* 
l+a* T P*a* 



'T'/N \ 

a* x*r 






(2.32a) 



Now because of the normalization 



T 

Z k+1 = ( * Z k+1 



a* = (a T a) a 



(2.32b) 



Substituting (2.32b) into (2.32a) yields 



-k+1 



P * 

F k+1 



X* + 

-k 



p £ 



P k— 



m m 

a a+a P£a 



p £*° Tp £ 

a T a+a T P*a 



(a 



k+1 



T~ . . 
a x*) N 



(2.32c) 



From (2.32c) it follows that explicit normalization of each 
equation is unnecessary. Instead a slight modification of 
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(2.26), as given in (2.32), results in the desired algorithm. 
It should be noted that (2.32) is not valid for the meaning- 
less observation with the coefficient vector a = 0_, which 
must be excluded from the recursive procedure. For this 
case 



X * 

— k+1 



p* 

k+1 




for a = 0 




(2 . 32d) 



In the following example sequential estimates for a 
vector of constant elements are computed from noise con- 
taminated measurements using both the least-square-error 
and the normalized least-square-error procedure. The 
measurement noise is derived from a noise population with 
zero mean and a distribution of finite extent (i.e. uniform 
distribution), rather than from a normal distribution, in 
order to conform with practical problems where the largest 
possible measurement error is bounded. 



Example 2.2 : Estimate the unknown vector x of dimensions 

2x1 from the scalar measurements given as 



z k ■ + \ 



(2.33a) 



where 



k-1 2 

M k = [1, <V> > 



(2.33b) 



is a time varying observation matrix, z^ the scalar ob- 
servation and the measurement noise. At time instant k 
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the k equations, according to (2.33a), may be written in 
matrix form as 



z, = A, x + v, 
— k k— — k 



(2.33c) 



where 



A k = 



1 

1 



0 

1/4 



(kzl ) 2 

1 k ' 



(2.33d) 



-k [z l z 2 **• z k ] 



(2 . 33e) 



v k = [v 1 v 2 ... v k ] 



(2 . 33f ) 



Since x must be estimated from noisy measurements, Zy , it 
is necessary to solve the equation 



z, = A. x 
— k k— 



(2 . 33g ) 



The solution of (if. 33g) for the estimate of x is given by 



x, = P, A, z, 
— k k k— k 



where P k = (A^A^.) 1 



( 2 . 3 3h ) 
(2 . 33i) 



The estimation error is then defined as 



and the measure- 



Results for a specific case, where x = 
ment noise is a sample from a uniform noise population 
with maximum deviation ±0.1 and zero mean are shown in 
Figure 2.3 and' Figure 2.4. The sum e of the absolute 
estimation errors 



e = 




+ 




(2. 331 ) 



~ 1 ~2 _ 

where x^ and x^ are the two elements of x^ is shown for 

both estimators in Figure 2.5. Note that the estimation 

error does not approach zero. 

The experimental result of Example 2.2 may be verified 

by considering a similar estimation problem where the 

sequence of observation matrices is given as 




(2.33m) 



The matrix P^ in (2.26) for this problem takes the form 



P 



k 



1 

-1 




for k > 2 



In the limit as k goes to infinity 



Mm = 
k-*-°° k 




-1 

1 



(2. 33n) 
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and 



£im ~ 
k-^oo *k 



1 -1 

■1 1 



1 1 
0 1 





V.. 




i 




1 








V 2 




-i 



v x (2.33o) 



Since is a random variable, it follows from (2.33o) that 



£im 

k->°° 



(Prob (x-x) = 0} = 0 



and the estimation error will approach a constant bias 
dependent on the value of the first noise sample. In 
Example 2.2 the resul,t is quite similar and the estima- 
tion error will approach a constant bias with probability 
equal to 1. This bias is dependent mainly upon the first 
few noise samples as shown in the following. 



T 

A A 



k 



k 

2 (1 
2 




2 



k -I 2 k , 4 

E (1~) E (1 - i) 

2 1 2 1 



and P, = 



k 

E 

2 



u4> 4 

l 



1 2 

1 Z (l-±) 
k 2 1 



k 4 

1 1 

* 2 (1 ’ r) 

1 k ^ 2 

i * (1 -r> 



1 k x 2 
2 (1 ’ r> 



( 2 . 33 p ) 

In the limit as k goes to infinity the elements of P^. do not 
go to zero but approach constant values. An approximation 
of P^ may be obtained by converting the piecewise- 

constant functions in (2.33p) to continuous functions and 
the summations to integrations. Thus 
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Since 



If 4 If — ">;■ A 

Aim 1 * , 1 . Aim 1 7 2 1 * v 

1 L ( 1 — -) , 7 - J ( 1 — ) dx 

k-*-°° k o l k-*-°° k _ , x 

Z 2 -h 



)iim rl, _ 4£ nx _ *L + i — 

k+~ l k l x 2 3x 3 



= 1 - “ lk ^(k+. 5 )l 



= 1 



Aim 1 „ .. . 

, r~ An (k) 
k->“ k 



A 1 m f 1 If/-* 1 * , I/-* 1 * 

k-~ {k 1 (1 -k> + 2 (1 -k> + 



Also 



and 



= 0 



, . , k , 2 „ . 1 k+% , 2 

Aim 1 „ 1 . , Aim 1 f ,, 1 . , 

. E ( 1 — ) dx - . 7- } ( 1 — ) dx 

k->°° k 2 x k -*- 00 k j x 



^ im {b [x- 2 Anx - -] } 

k->-°° k 1 x 1 5 



= 1 - 2 ij? * n <k+ - 5)1 



= 1 



n ' k -1 4 k -1^2 

( 1 -r» - k « «i-r» > > 

2 z 



Aim '■ . 1 . 4 , 

k - <U (1 “’ dx 



1 k+^ 12 ^ 

r [/ d") dx] } 

k 2 - 3 g x 



k+ . 5 
] > 
1.5 
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£im 

k-*- 00 



k+ . 5 



k+ . 5 2 



{ [x-4£nx - ^ ~] -^-[(x- 2£nx - |) | ] } 



x 3x 1.5 



1.5 



- -(1.5 - 4£n (1 . 5) - + 2 



IS 2 3 

1.5* 3x1 . 5 J 



2 2 
+ ki« { (k+ . 5) - 4£n (k+.5) - + 4 ■ - In (k+. 5)- ^[£n (k+.5)] } 



0 00 , £im 4 r „ , , 
“ 3.33 + k [£nk] 



- 3.33 



Since 



| [£n (k)]^= 4 [k ^£n(k)] 2 



= 4 <*“*[&-£> + 1 



= 0 



Thus 



£im 

k-»-» k 



3.33 



1 -1 

-1 1 



(2 . 33q) 



and also 



£im Um _ .T 

i x, = . P, A, v. 
k^-oo _k k-*-°° k k— k 



1 

-1 



E i (2-i) v . 

i=l 1 1 



(2 . 33r) 



1 

rlj 



{v l + 4 V 2 + 9 V 3 + 16 V 4 + 25 V 5***^ 
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Eq. (2.33)r is a weighted sum which places emphasis on the 
first few noise samples and whose weighting factors approach 
zero as k becomes larger and larger. Thus for large k 
the estimation error depends mainly upon the first terms of 
the summation in (2.33) r and approaches a constant bias. 

As a consequence of Example 2.2 the assertion in Lee 
[Ref. 3, page 53] that = [0] is contradicted and the 

estimate obtained is not consistent. 

2 . Application to Identification Problems 

The foregoing results may also be used in identi- 
fication* problems. Consider a discrete-time system char- 
acterized by the discrete equation 



x k ■ 



a l x k-l +a 2 x k-2 + ' 



+ a x, +b n u, T+b,,u, -+...+b u, 

n k-n 1 k--l 2 k-2 m k-m 



(2.34) 



where in general n m, and x^._ n and u^_ n represent the 

system response and driving function at time t = (k-n)T. 

It is desired to determine the coefficients a. and b. 

i 3 

(where i=l,2,...n and j=l,2,...m) from sequential measure- 
ments of the output. If the measurements z^ are noiseless 
then 



x k = Ix k-i 



x, u, . 

k-n I k-1 



u. 



k-m 



3n_ 



after k = n+m measurements the following data bank will 
result if the system starts with zero initial conditions. 



j)f 

The expression identification is used to designate 
problems involving the solution of (2.1) when some or all 
of the elements in A are random variables. 
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r 

i— i 
N 

1 




Z 2 

♦ 

• 


= 


• 

z , 
n+m 





x 0 o 



0 

0 



x , , x , „ x 

n+m-1 n+m-2 m 



u. 



u 1 



u 



u , , u ^ „ 

n+m-1 n+m-2 



0 

0 

, u 



n 



n 
3 i 

m 

(2.35) 



or 



z = [x ^ : u ^ ] 

— n+m n+m ; n+m 



a 

. 32 . 

b 

— m 



(2.36) 



The identification problem is then to solve (2.35) for the 



a i 

n_ 

b 

— m 



An exact solution is obtained only if the 



vector 
matrix [X. 

is then possible using (2.26) or (2.32) when the number of 



n+m U n+m^ nons i n 9 u l ar * Sequential estimation 



observations K > m+n and the matrix [X , U , ] has 

— n+m j n+m 

maximum rank equal to m+n. 

In the presence of measurement noise the observations 
of the output become 



Z k+1 X k+1 + V k+1 



where v. . is the measurement noise. 
k+1 



be approximated by 



Eq. (2.35) may then 




(2.37a) 
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or 






= [Zt 



V 



— n 

"b 
— m 



(2.37b) 



This solution is proposed by R.C.K. Lee [3]. 

An alternate approach is to use an estimate for the 

element x. in (2 . 35 ), denoted by x. in the matrix X, . Thus 
1 l x 



/s 





x. 

l-n 



u i-i 



A 




(2.38) 



This approach, as shown in Example 2.3, has a distant advan- 
tage over the solution of (2.37b). This example also 
demonstrates the difference between the least-square-error 
and the normalized least-square-error solutions as given by 
(2.26) and (2.32). 

Example 2.3 : Identify the coefficients a^ and b 1 in the 

difference equation 



x k + i = a iV b i u k 

from noise corrupted measurements z k+1 , 

z, , i = x, , . + v, . . 
k+1 k+1 k+1 



where 



(2.39a) 



(2.39b) 



and is a sample of the measurement noise with the follow- 
ing statistical properties 



E 



{ V . > = 0 
1 



E (v i 





6 . . 

i/D 



<° 

1 i=3 



(2.40) 
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2 

where E{ } denotes the expected value, and a is the vari- 
ance of the measurement noise. Specifically consider the 
discrete equation 

* k+1 = .9x k + .1 u k (2.41) 

For a given input sequence (u^ . . . u k ) and zero initial 
conditions, the output sequence {x Q ,x 1 ... x k > is generated 
using (2.41). The output sequence is then corrupted with 
noise, taken from a uniform distribution with maximum 
deviation ±0.1 with zero mean to obtain the sequence of 
noisy measurements {z q ,z^ z k ) which then are processed 

according to Eqs . (2.26) and (2.32). For both approaches, 

one using the z^'s and the other using the x^ ' s of Eq. (2.38) 
as elements in the coefficient matrix X, two computations 
are made - one where the input is a unit step and one where 
the input is a sampled cosine wave of unit amplitude. 

Typical results using the same measurement data for 
both estimators (the least-square-error and the normalized 
least-square-error) are shown in Fig. 2.6 through 2.17. 

Figs. 2.6 and 2.7 present the estimates for a-^ and b^ for 
the least-square-error and the normalized least-square- 
error solutions for a step input. Fig. 2.8 presents the 
magnitude of the total identification error e k = |a^(k)-a^| 
+ jb^(k)-bjJ for both cases. Figs. 2.9, 2.10, 2.11 present 
the identifications using the estimated past values for X k 
as given by (2.38). Figs. 2.12 through 2.17 present the 
corresponding data when the input is a cosine function. The 
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results indicate that the identification error is generally 
smaller when the normalized least-square-error method is 
used and that the identification error depends upon 
whether the input function causes a significant system 
response. In the following it will be demonstrated that 
for the step input the identification error approaches a 
very large value, compared with the parameters to be iden- 
tified, independent of the variance of the measurement noise 
while for the cosine input the identification approaches 
a constant bias dependent on the variance of measurement 
noise. Consider the least-square-error solution of (2.37b) 
which takes the form 



n 



1 



m 



{ [Z. ! U, ] T [Z. ! U. ]} [Z. j U. ] T z, 

k i k k k k| k J — k 



(2.42) 



where k >_ m+n. For Example 2.3 this may be written as 



t-k-1 9 

Z z7 
i=0 1 



k-1 
Z u . z . 



k-1 

Z u. z . 
i=0 1 1 



k : 1 2 

Z u. 



i=0 11 i=0 1 



-1 



-k-1 - 

Z z . z . 
i=0 1 1+1 



k-1 

Z u . z . , , 
i=0 1 1+1 



(2.43a) 
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or 



-1 



✓N 

a i 




fi k : 1 2 

r- Z Z . 

k i=0 1 


1 k ? 1 

7 - Z u . z . 
k i=0 11 




i k “ i n 

r- Z Z . Z . .. 

k i=0 1 1+1 


A 

b l 




1 k " 1 

Z u . z . 

L k i=0 1 1 


1 k " 1 2 
Z u . 

k i=0 1 J 




1 k ' 1 

L 1 i=0 1 1+ il 



(2.43b) 



In the limit as k goes to infinity the summations in (2.43b), 
provided that and x k remain bounded, take the following 
form 



Aim 1 
k^-°° k 



k-1 

Z 

i=0 



2 
z , 

l 



Aim 1^ 

k-*-°° k 



k-1 2 k-1 k-1 2 

[Z x.+2 £ x.v.+ Z v.] 

l . „ ii . „ i 

1=0 i=0 i=0 



Aim 1^ 
k+°° k 



k-1 

Z 

i=0 



2 
x . 

l 



+ 



a 



2 



Aim 1^ 

k-^-oo k 



k-1 

Z 

i=0 



u . z . 



i l 



Aim 1 _ 

k-*-°° k 



k-1 k-1 

[Z u. x. + Z u.v.] 
i=0 11 i=0 1 1 



Aim 

k-*-°° 



1 k " 1 

r- Z u.x. 

k i=o 11 



n • i k-1 

km 1 „ 

k->°° k . . i l+l 

i~0 



k-1 k-1 k-1 k-1 

, im f-[ Z X. X. - + Z X.V...+ Z X. , . V. + Z V . V . , ] 

k-* 00 k . - l l+l . . i l+l . n l + l l . l l+l 

i=0 i=0 i=0 i=0 



k-1 

Aim 1 K ± 

k-*-°° k x i x i+l 



o • i k-1 

Aim 1 v 

, Z u . z . , . 

k-*- 00 k . q i l + l 



Aim 1_ 
k-*- 00 k 



k-1 k-1 

[ Z u . x . + Z 
i=0 1 1 i=0 



u.v. , . 
i l+l 



] 



Aim 

k-S-oo 



1 k - x 

Z u.x, 

k i=o 11 
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Then for very large k the following relation is obtained 
from (2.43b) 



kl~ - 

E xf + ka z 
i=0 1 



k-1 

I u . x . 
i=0 1 1 



u-1 

E u . x . 
i=0 1 1 



k : 1 2 

E u . 
i=0 1 



T : k-1 1 

a. i E x . x . . . i 

1 | i-0 1 1+1 



(2.44) 



i k-1 

E u. x . , , 

U=o 1 1+1 



The corresponding expression for noiseless observations is 



or 
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u i X i+l_ 



(2.45) 



ka^a^ 



(2.46) 

The identification error for very large k is then obtained 
by combining (2.44) and (2.46) 
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(2.47) 



In the limit as k goes to infinity, Eq. (2.47) is easily 
evaluated by considering the corresponding continuous system 
with the transfer function 



x(s) _ 1 

u(s) s+1 



(2.48) 



Integration of (2.48) yields (2.39a) when the forcing func- 
tion u(t) is approximated as piecewise constant and the 
sampling time At satisfies the relation 




or 

| At | = | £n a^| (2.49) 

For the step input, when the system is initially at rest 



u ( t ) = 1 
x ( t ) = 1 - e 



(2.50) 



Then 
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ilim 
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(2.51) 



and 
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(2.52) 



(2.53) 



This demonstrates that identification of system parameters 
from a step input is only feasible as long as the system 
response is not close to the final value, as shown in 
Fig . 2.6. 
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Similar considerations yield the estimation error for 



the unit cosine forcing function. Using (2.48) and 



U(s) = 



s 2 +l 



it follows 

x (s) 



and 



_ -k 



(s+1) (s+1) 
u (t) = cos (t) 



ks+h 

s+1 2 , 

s +1 



also 



x(t) = (cos t + sin t - e 



k-1 t 

£im 2 £im . 2 ... £im 1 

i Ex. = , / x (t dt = . T t 

k-*°° . A l t->°° _ t-*°° 4 \ 

i=0 0 ' 

k-1 . t 

£im „ £im r ... ... £im 1. 

, E u.x. = . / u(t)x(t)dt = . t > 

k-*°° . „ ii t->°° n t-*°° 4 

i=0 0 



n • k-1 „ 

£im „ 2 

, E u . 

i=0 1 



£im r 2 ... , . £im 1, 

. f u (t)dt = . ^-t 

t->oo t-*°° 2 



Then from (2.47) 



£im 

k->oo 



~ a l~ 


_ £im J 


(^+a 2 ) t 


kt 


-1 
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-ta a^ 
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A- 


t->“ 




%t_ 




0 
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-c a^ 

1 1 2 
l6 + 2° 



k 



i\+0 2 ). 



“-1 ' 



1 + 



8c‘ 



1 

2 J 



2.54) 



2.55) 



(2.56) 



(2.57) 



(2.58) 
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The identification error as shown in Eq . (2.58) approaches 

a constant bias which is reasonably small for a large 
signal to noise ratio. For the specific example (Ex. 2.3) 
this bias as obtained from (2.47) for very large k is 



■V 




"-.0212“ 


- s l- 




.0106 _ 



for the least-square-error solution and 







“-.016 3“ 


1 

r-H 




_ .0054. 



for the normalized least-square-error solution. The total 
bias error reduction for the normalized least-square-error 
solution is approximately 32% compared with the least- 
square-error solution. This agrees with the experimental 
results obtained as shown in Fig. 2.14. 
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LFHST SQUARES 1 DENT 1 F 1 CRT 1 ON 

- STEP INPUT - UNCQRRECTED COEFFICIENTS 




2.00 0.02 2.50 10.02 12.50 



FIG 2 6 EX. 2.3 Nurarn swlf.5 u-ioj 



NORMRLJZEQ LERST 5QURRE5 1 DENT I F 1 CRT 1 ON 
- STEP INPUT - UNCQRRECTED COEFFICIENTS 




49 




50 
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FIG 217 EX 23 number of simples d-tot 



III. THE BEST APPROXIMATE SOLUTION 



In estimation theory, quite frequently one has to 
solve a set of inconsistent or insufficient specified 
linear equations. Since in these cases an exact or unique 
solution does not exist, an optimal or best approximate 
solution has to be accepted. Penrose [4,5] has defined 
this best approximate solution as follows. 

Definition 1 : X q is the best approximate solution of 

the linear equation 

f(X) = G, (3.1) 

where X and G are rectangular matrices, if for all X ^ X q 
either 

|| f (X) - G || > || f (X Q ) -G || (3.2a) 

or || f (X)-G|| = ||f (X Q ) -G || and ||x||>||X o || , (3.2b) 

where ||x|| denotes the norm of X defined as || X || - trace X X. 

In the discussion which follows, X is restricted to be 
a vector of dimensions n x 1 with real elements denoted by 
x. Then Eq. (3.1) may be written as 

Ax = b (3.3) 

where A is an m x n matrix and b is an m x 1 vector. 

The best approximate solution for x, according to 
definition 1, is the least-square-error solution if A has 
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maximum rank, and the minimum-norm least-square-error 
solution if A has rank less than maximum. This solution is 
obtained by using a generalized matrix inverse developed 
by Penrose [4], and denoted in the work which follows as 
the pseudo inverse. The solution to (3.3) is thus written 
as 

x = A + b (3.4) 

where x is the best approximate solution and A + the pseudo 
inverse . 

In the following the definition and some properties of 
the pseudo inverse, as given by Penrose, are stated. Then 
some alternate expressions for the pseudo inverse are dis- 
cussed. Finally a new recursive formulation for the 
sequential solution of Eq. (3.3) is presented. This form- 
ulation has the advantage over previously published results 
(Wells [6]) in that the dimensions of the matrices in the 
algorithm remain constant irrespective of the size and rank 
of A. A flow diagram for the computation of the algorithm 
is presented and illustrated with a numerical example. 
Finally, the recursive algorithm is adapted to the problem 
of estimating the states of time-varying, linear systems 
from noise-contaminated measurements. 

A. THE PSEUDO INVERSE 

1 . Definition and Properties 

Penrose [4] defines the following 



58 



Definition 2 : Four matrix equations are defined 



AYA = A 


(3.5a) 


YAY = Y 


(3.5b) 


* 


[AY] = AY 


(3.5c) 


* 


[YA] = YA 


( 3 . 5d) 



where * denotes the conjugate transpose. These equations 
have a unique solution for Y. This solution is called the 
pseudo inverse and is denoted by Y = A + . 

The essential feature of this definition is that any 
expression for the inverse of matrix A is established as 
the unique pseudo inverse if and only if it satisfies 
Eq . (3.5). As a consequence of definition 2 the pseudo 

inverse has the following properties 



A ++ = A 


(3.6a) 


A* + = A + * 


(3.6b) 


A + = A ^ if A is nonsingular 


(3.6c) 


(AA) + = A _1 A + 


( 3 . 6d) 


(A*A) + = A + A* 


(3 . 6e) 


A,A*A,A + ,A f A have rank equal to the 


trace of A + A 



In addition, for completeness, define [1] 

0 + = 0 T ( 3 . 6f ) 

2 . Alternate Expressions for the Pseudo Inverse 
It is desirable to be able to express this inverse 
by a mathematical formula and the following results are 
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essentially available in the literature as discussed by 
Deutsch [1], Koenig [6], et. al. 

a. Overdetermined case, m>n , r=n 

As shown in Chapter I, the solution (3.4) is 
obtained using 



A + = [A T A] A T 



(3.7) 



which corresponds to the minimum mean-square-error solution 
of (3.3). 

b. Underdetermined case, m<_n , r=m 

The solution (2.4) is obtained using 
■1 



A + = A T [AA T ] 



(3.8) 



which corresponds to the minimum-norm solution. 

Equation (3.8) satisfies definition 2 and is thus the 
desired pseudo inverse. The fact that the solution is the 
minimum-norm solution can be demonstrated geometrically for 
the three-dimensional case as follows: 

Given two equations in three unknowns 



a l a 2 a 3 
L b l b 2 b 3. 



L°iJ 



(3.9a) 



Find the minimum-norm solution for the unknown vector 
[x y z] T . 

Eqs . (3.9a) represent two planes 

r • a = c 1 (3.9b) 

r ♦ b = c 0 (3.9c) 
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where r is the position vector from the origin to the 
point (x,y,z) 

a a vector with components a^ , , a^ 

and b a vector with components b^ , b^, b^ 

Normal vectors to the planes are given by a and b 



Then any point on the line of intersection of the two planes 
satisfies (3.9b) and (3.9c) and the desired solution is the 
point on the line of intersection closest to the origin. 

Let the vector from the origin 0 to this point N be desig- 
nated by ON (see Fig. 3.1). ON is a linear combination of 
the vectors a and b 



ON = y^a + Y2^ 




(3 . 9d) 



where y^ and Y 2 are scalars, which are determined from the 
condition that ON has to satisfy (3.9b) and (3.9c) as the 
position vector r. 

Thus 



(y 1- + y 2- ) ’ a = c i 



(Yi a + Y2^) • b - c 

and 





~a • a 


a- b _ 


1 

i — 1 

u 

1 

1 — 1 
1 




a*b 


b • b 


L c 2 j 



( 3 . 9e) 
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LINE OF INTERSECTION 
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Equation (3.9d) may be written in component form using 
(3.9e) as 



or 



X N 




a l 


b l 




y 2 

Z a . 

i=l 1 


3 

E 

i=l 


a . b . 

l l 




— 


a 2 


b 2 




3 


3 


o 


_ Z N 




_ a 3 


b 3 




E a . b . 
i=l 1 1 


E 

i=l 


b . 

l 


V 




_a l 


b l” 




r 


a l a 2 


a 3 




a i 


^N 


= 


a 2 


b 2 






CM 

rQ 

1 1 

I 


b 3j 




a 2 


_ Z N_ 




_ a 3 


b 3_ 












_ a 3 



-1 



If the matrix A is defined as in (3.9a) 



A = 



a 2 a 3 



(3.9g) 



-1 



(3.10a) 



(3.10b) 



1 2 3 

the result (3.10a) may be written in the form of (3.4) 



x 



N 



% 




z 



N 




(3.10b) 



+ rn T - 1 

where A = A [AA ] . 

This shows that in this case the pseudo inverse in (3.4) 
results in the minimum-norm solution. 

c. Underdetermined case, m >n , r<n or m<n , r<m 
The solution (3.4) is obtained using either 

, m T T ^ T 

A = A N [N Ml] N (3.14a) 



63 



or 




(3.14b) 



which corresponds to the minimum norm solution of minimum 
square error. Matrices N and M are defined as factors of 
A [1] 



where matrix N is of dimension m x r, and matrix M is of 
dimension r x n. The rows of N are chosen such that they 
constitute a set of base vectors for the column space 
spanned by A. Matrix M is then the transformation of N to 
A. Its dimensions are necessarily r x n. For example, the 
columns of N might be chosen as all the independent columns 
in A. The pseudo inverse is then given as 



because (3.12) satisfies all four equations in definition 1 
as indicated below: 



A = N»M 



(3.11) 



■f rn T-l T-lT 

A = M [MM ] [N N] N 



(3.12) 



■f rn T— 1 T-lT 

(1) A A = M [MM ] [N N] N NM 

T T — 1 
= M [MM ] M 




+ rp rr—lrp—lrn 

(2) AA = NM M [MM ] [N N] N 
T —IT 

= N[N N] N 



T, -1 r „T. 



(3) AA + A = NM M T [MM T ] 1 [N T N] VnM 



NM 



A 
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+ + T T — 1 T — 1 T T T— IT T 

(4) A AA = M [MM ] [N N] N NMM [MM ] [N N] N 

= m t [mm t ] _1 [n t n] _1 n t 

= A + 

Expression (3.12) which involves two matrix inversions may 
be simplified further to expressions involving only one 
matrix inversion. Since both matrices M and N have rank r, 
(3.11) can be solved for either one 

M = [N T N]~ 1 N T A (3.13a) 

N = AM T [MM T ]“ 1 (3.13b) 

Substitution of (3.13a) into (3.12) yields 

A + = A T N [N T AA T N] -1 N T (3.14a) 

and substitution of (3.13b) into (3.12) results in 

A + = M T [MA T AM T ] -1 MA T (3.14b) 

All expressions, (3.12) , (3.14a) , and (3.14b) , are valid 

general expressions for the pseudo inverse. However, if 
the matrices involved in actual computation are to be as 
small in dimensions as possible, (3.14a) and (3.14b) should 
be used as follows: 

( 1 ) m<n , r >m 

A + = A T N [N T AA T N] -1 N T 

( 2 ) m>n , r <_ n 

4 - np T T —1 T 

A = M [MA AM ] MA 



65 



B. RECURSIVE ALGORITHM FOR THE SEQUENTIAL LEAST-SQUARE FIT 
In Chapter I a recursive relationship for sequential 
estimation based on the equation 



b = Ax 



(3.15) 



and its least-square-error solution 

/s rn _ -I m 

x = [A A] A b 



(3.16) 



is given. However this recursive form is only valid if the 

T T — 1 

matrix [A A] is nonsingular such that its inverse P = [A A] 

exists . 

Consider now the case where no assurance as to the 
existence of the inverse can be given. Using the pseudo 
inverse it is possible to write formally 



or 



x = 



A + b 



A + [AA + ] T b 
A + A T+ A T b 
[A T A] + A T b 



x = 



T 

P A b 



(3.17a) 



T 

where P is the pseudo inverse of [A A] . The dimensions of 
the matrix P are always n x n independent of m or r. This 
is significant because in a sequential procedure both m and 
r increase as more data are incorporated. Therefore, the 
use of an algorithm updating the matrix A + , as proposed by 
T.N.E. Greyville [22], may not be practical for sequential 
estimation because the dimension, m, of A + grows at each 
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step. Alternate methods [7,8] using an updating procedure 
for A until the matrix A has dimension n x n are considered 
below: 

(1) Direct updating of A + . 

At each step, as additional measurements are incorpora- 
ted, the size of A + grows one column for each step. C. H. 
Wells [7,8] presents an algorithm for the updating of A + . 
However his procedure has the disadvantage that when initi- 
ating an estimation problem the rank of A must increase at 
each step until maximum rank is reached. This is not the 
case if the first n equations in (3.15) are dependent. 

(2) Updating A + using (3.8) as long as m <n . 

A recursive algorithm or direct computation based on 

T 

(3.2) is possible only if the square matrix AA , of growing 
dimensions, remains non-singular. Thus the rank of A has 
to increase at each step which may not be true. Further- 
more, a recursive form could be used only as a starting 

T 

procedure up until the matrix AA has dimensions n x n. 

Consequently, it is desirable to find a recursive 

formulation similar to (2.26), where all matrices involved 

have constant dimensions regardless of rank or size of A. 

This result is accomplished here with a recursive form for 

T 

the matrix P, where P is the pseudo inverse of A A. 

In order to derive this recursive algorithm, consider 

(3.18a) 



the set of equations 
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where the subscript k denotes the number of equations. 
Assume the solution of the form (3.17a) 



A 




r . T- , + T 
[A, A, ] A. 2. 
k k k — k 



T 

P, A, z, 
k k — k 



(3.19a) 



. T ~ 

where is the pseudo inverse of A^A^. and x^ the best 
approximate estimate for x based upon the last k equations. 
Then at the next step (3.18a) takes the form 



■ z k ~ 


= A , x = 


1 

< 1 
I 1 


- z k+l~ 


k+1- 


T 

La J 



(3.18b) 



where z. , . is a new scalar measurement and a is a row 
k+1 — 

vector of coefficients relating the observation z k+1 to x. 
The solution to (3.18b) is then 

(3.19b) 

In order to find an alternate expression for (3.19b) let 



x = P A^ z 
-k+1 k+1 k+ 1 -k+1 



x = x k + Ax 



Substitute (3.20) into (3.18b) and premultiply with A^ + ^ 
to obtain 



(3.20) 
T 



k+1 



"k+1 



= A k+1 A k+1 ( ^k + A ^> ' 



or 



A k-k + z k+l— = [A k A k + ( ^k + A - J 
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But 



T + 

A. A. = P. 

k k k 



Thus 

A^z, - P*x, + (z, , , - a T x.)a = [P, + + aa T ]Ax (3.21a) 

k— k k— k k + 1 — — 1 — k — 

T 4 . ^ 

The term A^z^ - P^Xj, can b e shown to equal the null vector 
as follows. According to the defining equations of the 
pseudo inverse, (3.5), 



_T . T^T+ T 

A k \ A k A k 



- [A kV Tft k 



+ T 
= A k\ A k 



+ + T 

= A, [A, A, ] A, A, 

k k k k k 



_ +^+T-T 7 . t 

— A, A, A, A, A, 

k k k k k 



T + T T + T 

“ i\ A kJ ‘\ A k 1A k - p k p k A k 



(3.22) 



Also since P, - P? and P* = [P+] T 

k k k k J 



P,P* 

k k 



+ T 

i p kV 



= p t fT p p 

k k 



= p'p 

k k 



(3.23) 



Then using (3.22) and (3.23), 



T + - T + T 

A, z, - P. x, = A. z, - P, P, A. z, 

k— k k— k k— k k k k— k 
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(3.24) 



- 11 - p k p k lA iA 



- 11 - p k p kJ A k2k 



- 11 - p k p k )p k p £&k - ° 



Then Eq. (3.21a) reduces to 



( z k+l " - T ^k^ - = ^ P k + -- T -* A -* 



(3.2 lb) 



The solution of (3.21b) for Ax is obtained as follows. 



(1) If has rank r < n and [I - P^P^] £ ¥■ 0_ 
the solution 



Ax d) . [I - p k p k» a . t 

a T [I-P*P k )a k+1 " ' k 



(3.25) 



satisfies (2.21b) by inspection. Ax^^ is not defined if 
either is of rank n, which implies that [I-P^P^] = 0 , or 
if [i-pj^P k ] a = 0. 



(2) If P^ has rank r = n, or if [I-P^P^Ja = 0_ 
the solution is given by 



a - <2> = 



(3.26) 



Substitution of (3.26) into (3.21b) yields 



[P k +aa T ] Ax ( 2) = - k+1 T {a- (a T P k a)a- [I-p£p ]a} (3.27) 

1+a P. a 
— k— 
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n , or 



Since either [I-P^P^] = 0 when P^ has rank r = 

[I-P^P^la = 0 (3.27) reduces to (3.21b). Using the above 
two possible solutions for Ax, which according to the 
conditions (1) and (2) are mutually exclusive, recursive 
forms for the solution of (3.18a) are established. 



(1) Recursive relation if P^ has rank r =n , or if 

U-pJPrU - o 

From (3.20) and (3.26), 



— k+X " 2k + 



p k* 



T 

1+a P. a 
— k— 



<z k+l - 5 5k> 



(3.28a) 



Combining (3.28a) and (3.19b) determines the updating pro- 
cedure for the matrix P, . : 

k+1 



Since 



~ T P k— — T 

— k+1 " P k A k-k “ . T d P k A k-k + z k+l 

1+a P. a 
— k— 



P k^ 



T 

1+a P. a 
— k— 



T 

= [P. £-= — -] ATz. + z, . , 

k , , T_ k— k k+1 

1+a P, a 
— k— 



P k^ 



T 

1+a P, a 
— k— 



P k^^ Tp k P k— 

lP k - tS-tt 1 i - tA 



1+a P, a 
— k— 



1+a P, a 
— k— 



Also 



k+i - 1P k - k t k ) [ftT 5k + "k+i^ 1 

^ lr n cl 

— k— 



*k+l = p k+l [A k2k + z k+l^ 
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Thus , 



P k+1 P k 



T 

P, a a P, 
k — k 

T 

1+a P. a 
— k— 



(3.28b) 



The new matrix satisfies the defining equations for the 
pseudo inverse (3.5). Thus (3.28a) and(3.28b) constitute 
the required recursive form 



— k+1 



k+1 



*k + 



P k " 



P k* 



T 

1+a P, a 

— k— 

T 

P. aa P. 
k — k 

T 

1+a P, a 

— k— 



(2 



k+1 



T- . 

a xp 



(3.28) 



(2) Recursive relation if P^ has rank r<n and 
[I-P+P^] a ^ 0. This condition excludes the solution (3.26), 
which does not satisfy (3.21b). Then from (3.20) and (3.25) 



2k+l = £k + 



- T ^ I_P k P k^- 



, T~ , 

(z k+l - £ — k 



(3.29a) 



For notational convenience define 



^k+l 



^ T(I - p k p k)i 



(3.30) 



and note that 2.] c+ 2 ^ as the f°H° w i n 9 properties 



T T _ . 

- a k+l —k+1— 1 

P k2-k+l = P k2k+1 = - 

m T + T 

2k+l P k = 2k+l P k = 9 . 



(3.31) 
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The desired updating procedure for P k+ ^ 
bining (3.20) and (3.25) 



is found by com- 



4+1 “ + a k+ i U k+1 - a T x k ) 

T T T 

P k A k-k " 2.k+l- P k A k— k + Z k+A+1 
tP k —k+1— P k ]A k-k + Z k+1— k+1 



Also , 



-k+1 [P k+l lA -k + z k+l P k+l— 



Then P k+ ^ must satisfy the following conditions 

(a > p k+ i ■ p k+ i 

T , 

since + ^ is a symetnc matrix. 



(b > p k+l A k - lp k - 2 k+1 S T P k ]A P 



(3.32a) 



(c) P k+1 2 = 2 k+ l 



(3.32b) 

(3.32c) 



A possible solution satisfying the above conditions is 



P k+1 * p k - 2 k +i— Tp k - p k — 2k+i + < 1+ £ Tp k£>2 k+1 2 k+1 

(3.29b) 

Assuming symmetry of , Eq . (3.29b) satisfies (3.32a) 

by inspection. Using (3.22) and (3.31), Eq. (3.29b) can 
be shown to satisfy (3.32b): 



T 

p k+ i A k 



= [p k -2 k+1 a Tp k J A k - 



T T 
P, a g, , A, 
k— ^k+l k 



(H-a T P k a)a k+1 ^ +1 A^ 



Since 



T *T 
— k+ l A k 



T + T 

g. . , P.P, +A7. 
— k+1 k k k 



and 



2k+l P k 



T 

o 1 , 



then 



m mm 

P, X1 A, = [I-g. , , a P, ]A. 
k+1 k -^k+l— k J k 



The last condition (3.32c) is also satisfied by (3.29b) since 

p k+l- = p ka-2 k+ i£ Tp k a- p k £2^ + i£ + < 1+ 2 Tp k = >2 k+ i2j +1 £ 

= p k 2 ' <2 Tp k 2>2 k+1 - p k £ + <l + 2 Tp k 2>2 k+1 



Hence 



P k+1— ^k+i 



In order to prove that (3.29b) is indeed the correct 

T + 

and unique expression for the pseudo inverse [Aj < . + ^A^. + ^ ] , 

has to satisfy the defining equations for the pseudo 
inverse, (3.5). Proof that the equations in definition 2 
are satisfied follows: 

Proof Using (3.31) and (3.23), 

( 1 ) 

P k+l P k+l = [P k“ a k+l- P k“ P k-2k+l + (1+ - P k- )a k+l2k+l ] [P k + -- ] 

= P k P k - 2 k+ A< + WiA k *>a T 

" P k ££k+1^^ T+ ^ 1+ — Tp k — ) —k+1— k+1— — T 



P k P k 9-k+l- P k P k + ^k+l- 



(3.32a) 
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(3.32b) 



[I-P. P* ] a a T [I-P. P* ] 

p p + = p P + + k k k k 

k+1 k+1 k k T fT „ „T, 



a [I-P k P k ]a 



Thus P k+ ]_ p k+ ]_ symmetric and 



“I - T 

p p = rp P l 

k+1 k+1 1 k+1 k+l J 



< 2 > p k + i p k + i = tP k + i p k + i lT 



This follows since p k+ ]_/ p k +i p k +i are s Y mnie ' t:r i c< 



(3) Using (3.32b) 



P + P P + = P + fP P + 1 
k+1 k+1 k+1 k+1 1 k+1 k+1 J 






a [I-P k P k J a 



- p >k p k + ^Vk + 



+ T 

= P| + aa 
k 



5 + p p + = p + 

k+1 k+1 k+1 k+1 ‘ 



( 4 ) Using ( 3 . 32a) 



P k+l P k+l P k+l [P k+l P k+l ]P k+l 



“ < ! P k P k + 2-k+l- [I P k P k ] | [P k ^ k+ i^. p k p k £^+i 

+ (l+a T P k a)g k+1 gJ +1 ] 



-P k P k P k %k+l- 2 k +l- p k P k P k P k -2. k +l + (1+ - p k -*2k+l- Sk+iSk+i 



75 



P k "2 k+ i£ p k " P k-%+l + (1+ £ p k £)2 k+ i2 k+ i 
P k+l P k+l P k+l = P k+1 * 



This concludes the proof and (3.29b) is indeed the desired 
updating procedure for the matrix P^. 

To complete the recursive algorithm for the solution of 
(3.18) a recursive form for the matrix = [I-P^P*] has to 
be established. From (3.32a) 

P k+ 1 P K +1 ■ p k p k + a k+1 a T [I-P k P k ] 

then 

[I - p k + i p k + i> - i p - p k p k ] - 2k+i- T [I - p k p k' 

and 

R k+1 R k 2 k+ i 1 R k (3.33) 



Note that the matrix R^ remains unchanged if the recursive 
form (3.28) is applicable because, in this case. 



P P 
k+1 k+1 



p k - 



p kS ° Tp k ' 1 

T 

1+a P^a j 



+ T 

P. + a a 
k — 



+ T P k— — P k + 

P.P. + P.a a £-= — — P7^ 

k k k , , T^ k 

1+a P. a 
— k— 



T 

- P k- T 

tT p k— — T 



1+a P, a 
— k— 



P P + 
k+1 k+1 



+ p k-- 

P P + + — 

k k nj _ T n 

1+a P. a 
— k— 



[I - p k p k ] • 



Since either 

[I-P k p£] =0 or a T (I-P k pJ] = 0 T , 



7 6 



then 



P P + = P P + 

k+rk +1 r k k 



and 



R 



k+1 




Then the complete recursive algorithm, if P^ has rank < n 
and [I-P k P^]a ^ 0_, is given by (3.29a), (3.30), (3.29b), 
and (3.33), which are summarized as 



R k- 

^k+1 Tl 

a R k a 



(a) \ 



T 



< 



Sk+l = Sk + 2 k+ l (z k+l - 2 & 



(b) 



P k+1 P k 2-k+l- P k P k-Sk+1 +(1+ - P k- ) a k+l 2 k+l (c) 



M3. 34) 



T t 



' R k+1 R k 2.k+l- \ 



(d) 



J 



where R k = [I-P k P k ] is an idempotent matrix, the trace of 
which is equal to n-r, where r is the rank of A k and n the 
maximum possible rank of A^. A computation flow diagram for 
the recursive algorithm (3.28) and (3.34) is given in 
Fig. 3.2. Note that if the normalized least-square-error 
solution is desired, only Eq. (3.24c) has to be changed to 



k+1 



“ p k ' 2 k+ iS Tp k - p k*2 k+1 + (a T S.+ a T P k a) 2k+l2 ^ +1 <3.34e) 



The complete algorithm, Eqs . (3.28) and (3.34), is illus- 

trated in the following simple example. 
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LEAST SQUARES FIT RECURSIVE ALGORITHM 
COMPUTATION DIAGRAM 




RETURN RETURN 



FIG. 3.2 
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Example 3.1 Given the four equations below, calculate 



the best approximate estimate for x sequentially. 

1 = [1 0 ] x 

2 = [3 0 ] x 

3 = [1 1 ] x 

4 = [2 1 ] x 

Following the computation diagram in Fig. 3.2 the results 
below are obtained 

a) k = l z^ = l a T =[l 0] 



~1 




1 


0 




0 


0 


0 


• p = 

' F 1 


_0 


0_ 


II 

1 — 1 
• «. 


0 


1_ 



trace = 1 

b) k = 2 z 2 = 2 a T = [3 0] 

trace R x = 1, CRIT = 0, then use (3.28) 



0 

0 



x 2 = x i + 



P l* 



T 

1+a Pa 



1.3 

0 



P = P 
2 1 



T 

P i aa p 1 

m 

1+a P ^a. 



.1 

0 



R 2 R 1 



0 0 
0 1 



trace R 2 = 1, 
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c) 



k 



3 



3 



[1 1 ] 



z„ = 



a = 



trace R^ = 1, CRIT = .5, then use (3.34) 



R„a 

2-3 = "T 

a R 2 ^2 



0 

1 



A 




A 




T 

+ g 3 (z 3 ~a x 2 ) 



1.3 

1.7 



P 



3 



T 

SL3E P 2 



P 2— 2.3 + (l+a Tp 2 a) 2 3 2.3 



.1 -.1 

-.1 1.1 



R 



3 



R, 




0 

0 



trace of R^ = 0 , thus all following equations are processed 
according to Eq. (3.28). 



d) 



A 




k=4 z^ = 4 a T 

- . P k— , T s 

= + = (z .-a x,) 

1+a P. a 
— k— 

.0952 
-.104 



= P 3 - 



P 3— — P 3 
l+a T P,a 



1.29 

1.67 



-.104 
. 714 



The vector x^ is then the best approximate solution to the 
given set of equations. 



The recursive algorithm presented here is more general than 
the one used in Chapter II, because it includes a starting 
procedure. Regardless of the rank of A the estimate 
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obtained is the best approximate solution according to 
definition 1. 

C. ESTIMATING THE STATES OF A LINEAR DYNAMIC SYSTEM 

An alternate interpretation for the above recursive 
algorithm is that of determining the constant state vector 
x for the following system from a series of noise-contam- 
inated scalar measurements: 

System: ^k+l = 1 x k 

Measurement: z k+1 = M k+1 x k+1 + v R+1 

where M^ + i is the time-varying observation matrix 
and Vj, +1 is the measurement noise. 

In order to be able to estimate the state vector for 
a dynamic system, where the transition matrix is 

in general time-varying and not equal to the identity 
matrix, it is desirable to develop an algorithm similar to 
(3.28) and (3.34) for the following systems and the scalar 
measurements z 



System: 


— k+1 $ k+l,k-k 


(3.35a) 


Measurement: 


z k+l = M k+1— k+1 + V k+1 


(3.35b) 



For the case when the system of equations for the estimation 
of x^ is determined or overdetermined (P^ has rank n) Eqs . 
(3.28) are easily adapted to include the transition matrix 
[3]. Let the equation 
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(3.36a) 



Z, = A. X. + V, 

— k k— k k 

be valid at time instant k, then the solution is 


(3.36a) 


/v r T ,-l T 

2k - [ W *iUk 

and 


(3.36b) 


p k ■ i^V' 1 

Using the property of the transition matrix that 


(3.36c) 


[$ k,k-l ] = $ k-l,k ' 


(3.37a) 


Eqs. (3.36) at time instant k+1 take the following 


form 


-k - A k\,k_A+l 

~ ,.T -T* . ,-l.T ,T 

—k+1 ^k , k+l A k A k^k ,k+l ^k,k+l k^k 


(3.37b) 


$ k+l,k-k 

Q k+1 = [ \,k+l A k A k\,k+l ] 


(3.37c) 


T 

= $ P $ 

k+ 1 ,k*k k+ 1 ,k 


(3. 37d) 



Thus whenever the matrix has rank n, the valid recursive 
algorithm for the dynamic case, including the new observa- 
tion is 
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f Q = $ P ^ 

w k+l w k+l,k k k+1 ,k 



’k+1 “ Q k+1 



Q k+l M k+l M k+l Q k+l 

1+M k+l Q k+l M k+l 



(3.38) 



-k+1 



^k+l,k— k + 



Q k+l M k+l 



1+M k+l Q k+l M k+l ) 



Consider now the case when [A^A^] is singular so that the 
pseudo inverse 



p k - t A kV + 



(3.39) 



should be used. At time instant k+1 



T + 

0 = f$ p $ 1 

y k+l 1 k,k+l k k,k+l J 



(3.40) 



A comparison of (3.39) and (3.40) can easily be made using 
(3.14a) . Let 



then 



A, = N • M 
k 



+ T T T - 1 T 

A r -- M^fMA^M 1 ] X MA k 



Also define 



B k ’ A k $ k,k+1 



then 

+ T T t t T - ^ T 

B = U4.1 M [M4>, , .A, A, 4>. , ,M ] M 4> A^ 

k,k+l 1 k,k+l k k k,k-t-l k,k+l k 



Thus (3.39) and (3.40) may be written as 
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T + T ~ 1 
P k = M [MP k M ] M 



T T + T T 

Q, ,, = $, , ,,M [M <J> , , P, M ] M $. . 

k+1 k,k+l 1 k,k-ll k k,k+l k,k+l 



This reveals that there is no simple relationship between 

the matrices and t^,k + l A k A k 4> k,k + l 1 + ' 

Thus (3.34) cannot easily be adapted to yield the best 
approximate estimate for the state vector at k+1. However 
an acceptable alternative for a starting procedure is to 
estimate x^ sequentially using (3.28) and (3.34) until P k 
reaches maximum rank; then (3.38) may be used. The inter- 
mediate estimates when P k has rank r<n are given by 

4 = • 
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IV. FINITE ITERATION METHODS 



In previous chapters methods for the solution of Ax=b 

and for the pseudo inversion in recursive form have been 

presented for the solution of the sequential-estimation 

problem. In this chapter, finite iteration methods for the 

solution of a set of linear equations and for matrix pseudo 

inversion are presented. These methods are based upon an 

infinite sequential error-correcting scheme, proposed by 

J. Nagumo and A. Noda [10], combined with the Gram - Schmidt 

process [9]. The derived methods require only a finite 

number (equal to the rank of A) of iterations. This 

approach has also been considered by L. D. Pyle [23] and 

some of the results presented in this chapter are similar 

to his. For the proper use of Pyle's algorithm it is 

necessary to rearrange the given set of equations whenever 

T 

the constant b^ in the first equation, a^x = b^, is equal 
to zero. Since this may not always be convenient in practice, 
an alternate algorithm is presented in which the computation 
starts unconditionally. Section A presents the basic 
iteration procedure with geometric interpretation. In 
Section B(l) this method is combined with the Gram - Schmidt 
process resulting in a procedure for the solution of a 
consistent set of equations. These results are extended 
in Section B(2) to solve a set of inconsistent equations 
and the solution is shown to be identical to the best 
approximate solution according to Penrose. An alternate 
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method accomplishing the same result is then derived. In 
Section C, the foregoing methods are extended to solve for 
the matrix inverse, when it exists, and for the pseudo 
inverse. In Appendix A these results are applied to the 
iterative solution of a set of non-linear equations. 



A. INFINITE ITERATION PROCEDURE 

Consider the problem of solving the following consistent 
set of equations. The term consistent is used to denote 
the fact that the system of equations is assumed to have 
either an exact solution or a unique locus for the solution. 
Alternatively b is contained within the vector space 
spanned by the column vectors of A. 



Ax = b (4.1) 

T 

Let a^ represent the 1 ' th row of A and b^ the 1 ' th element 
in b. Then (3.1) may be written as 



T 

a-^x 



= b. 



T 

a 2— 



= b. 



T 

a x 

m— 



= b 



m 



(4.2) 



The iteration scheme proposed by J. Nagumo and A. Noda [10] 

solves each equation in (4.2) successively for x by adding 

to each approximation for x a correction of appropriate 

size in the direction normal to the hyper-plane in x-space 

T 

represented by a , x = b^. After solving the m'th equation 
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the process starts over again with the first equation. Let 
the i ' th approximation for x be denoted by x^ and the 
initial estimate for x by x^ = 0 . Then the method may be 
presented as 



x. , . = x A , . + (b. -a. x. , . ) 

1-t-jm — O + jm 1 — 1— O + jm t 

- 1-1 

T —2 

i x~ , = x. , . + (b„-a„x. . . ) — = — 

J — 2 + jm — 1+jm 2 — 2— 1+ jm a T 



\ 



— 2—2 



, ,, T . — m 

x_,_,_ = x_ (b -ax ... ) 



> 



(4.3) 



— m+jm — m-l+jm m — m— m-l+jm T 

\ 3. cl 

\ — m — m / 



Eq. (4.3) may be derived as follows. Consider the i ' th 
equation in (4.2) and assume that the solution has the 
form 



x. = x. . + Ax. (4.4) 

—l -l-l —l 

where x i-i is the solution for x obtained from solving the 
(i-l)st equation. Combining the i 1 th equation of (4.2) 
with (4.4) yields 

(b . - g^x. .) = a?Ax. (4.5) 

l ^ l-i-l' —l —l 

which may be solved for Ax using the best approximate 
solution according to Penrose, namely 

A— i = < b i - a^i-p) • (4 ‘ 6) 

cL . 3 . , 

— 1—1 

Then Ax^ is the minimum norm solution of (4.5) , and 
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a . 

— 1 



(4.7) 



x . = x . . 

—l -l-l 



(b . - a . x . . ) 

l —l—i-l 



T 

a . a . 

—l—i 



This describes the equations in (4.3) before the iteration 
process starts over again with the first equation in (4.2). 
The convergence rate of this iteration scheme, although 
quite rapid at first, decreases asymptotically towards zero 
as the solution is approached. The limit is the exact 
solution if the system is determined, that is, if the rank 
of A is equal to the number of unknown elements in x. How- 
ever, if the system is undetermined, the minimum-norm 
solution, as discussed in Chapter III, is approached since 
each correction Ax^ is in the direction normal to the plane 
described by the i ' th equation in (4.2) 

B. FINITE ITERATION PROCEDURE 

1 . Sets of Consistent Equations 

The foregoing iteration scheme requires an infinite 
number of steps to converge to the solution. If the process 
is truncated, only an approximation is obtained. As will be 
demonstrated this difficulty may be remedied by constraining 
the corrections to be orthogonal to each other. 

Again consider the set of consistent equations (4.2) 
where each vector a^ is normal to the hyper-plane described 
by the i ' th equation in (4.2). These normal vectors are 
generally not orthogonal to each other. However, using the 
Gram-Schmidt process [9], an orthogonal basis for the vector 
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space spanned by the vectors a^ (i = may be 

obtained. According to this procedure a sequence of vectors 
ou is constructed from the set of vectors a^ in the follow- 
ing manner: 



a n = 






-1 


-1 


T 






—1—2 


—2 = 


-2 


T -1 


• 




SLi“i 


• 

• 




T 

k-1 a. a, 
- — l— k 


a, = 


a. 


Z — 


-k 


-k 


. , T 






1=1 a. a 



then the set of vectors a . consists of mutually orthogonal 

—l T 

— i^Jc 

vectors only. Note that — - — a. is the component of a, 

a. a, 1 K 

—i—i 

in the direction of a. , which is subtracted from a, leaving 
the normal component to a ^ . In recursive formulatipn this 
orthogonalization process may be presented as follows. 

Let {C Q fC^ , . . . , C^} be a sequence of n x n matrices with 
C = I, then the set of mutually orthogonal base vectors 
are obtained from 



= c k _^a_ if = 0_, recalculate a^. using a^ +1 



C, = C, *i 
k k-1 



T 

^k“k 



l 1 , 2 ,..., m 

k = 1 ,2 , ... ,r 



( 4 . 9 ) 



If the process yields a zero base vector, the corresponding 
vector a i is a linear combination of the previously defined 
base vectors, and the i ' th equation is a linear combination 
of the previous equations. Therefore the i ' th equation and 
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the corresponding zero base vector may be disregarded, so 
that, finally, there are only r independent equations and r 
orthogonal base vectors, where r is the rank of A in (4.1) . 
The exact solution for x or the minimum-norm solution for 
x, if the system of equations is underdetermined, is 
obtained as a linear combination of these r base vectors. 
Using the form of (4.7) a correction Ax^. is made success- 
ively for each of the r mutually orthogonal base vectors 



T — k 

2k = 2k-l + (b k • 2 k * k -l> -T— 

2k2k 



(4.10) 



It should be noted that the process (4.10) terminates after 
the r corrections are made. Thus the infinite iteration 
process of (4.3) essentially reduces to an r-step process. 
These results are summarized in (4.11) 



^k 



Ci i a . 

k-1 — i 



if a, =0, recalculate a, using a. . \ 
— k — — k 3 — li ' 



C, = C 



k-1 



^A 

T 

-k-k 



, -k 

-k -k-1 + T 

“A 



ip i - 1 , 2 , . . . ,m 

(b k“-i-k-l ) k = 1 , 2 , . . . , r 



) 

(4.11) 



Note however that the first equation which starts the pro- 
cess has to have a nonzero element, b-^, in the vector b, 
because if b^ = 0, x^ = 0_ which is not correct generally. 
If b^ = 0 the process may be started by either choosing 
another suitable equation of (4.2) as the first equation 
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or by initializing x = a., where a. x a. ¥ 0 . That is, 

3 — o —3 — j —1 

a^ is not parallel to a^ . Eqs . (4.11) may be rewritten in 

a compact form as 



2k = 2k- 1 + T 



P, 3. . 
k-l~J 



a .P. . a . 

-3 k- 1-3 



(b . - a . x, . ) 
J -D-k-l' 



T 

P, . a . a .P, i 

p = p k - 1 — ]— ] k -1 

k k -1 T 

a ,P, . a , 

-3 k-l-n 



(4.12) 



k = 1 ,2 , . . . ,r 



where P q = I and the index j denotes the succeeding equation 

T 

in (4.2) for which a. P, a . ^0. It is interesting to 

—3 k — 3 

compare the form (4.12) with the form of the recursive 
least-square-error solution (2.26). The following example 



illustrate 


s the 


use of 


(4 


. 11) 


Example 


4. 


1: 


Use 


Eqs . 


(4 


.11) 


equations 


for 


X. 












1 


0 


2 




X 1 




1 




1 


1 


0 




X 2 


= 


0 




0 


2 


1 




_ x 3 _ 




0 




X 

— 0 


= 


0 


C 

1 


0 


I 





Step 1: 



a-, = C a. - 
—1 o— 1 



*1 = *0 + (b l~— JL^o) 



a . 



T 

a . a . 

— 1— x 



' 0 


1 


1 


r.2 1 


0 


1 


0 


= 0 


0 


3 


2 


l = 4 



C - 
o 



T 

ai“i 

T 

a . a . 

— 1— x 



.8 0 -.4 

0 1 0 
-.4 0 .2 
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Step 2: a 2 = C^a^, = 



. 8 
1 

-.4 



-2 



T —2 

— 1 + ( b 2 - — 2— 1 } ~T 

- 2-2 



.2 


1 


. 8 




1 


0 


X 

77 


1 




-1 


.4 


y 


-.4 




4 



C 2 C 1 



T 

“ 2^2 

T 

— 2—2 



1 

9 



4 -4 -2 

-4 4 2 

-2 2 1 



Step 3: 



“3 



C 2— 3 = 



-10/9 
+10/9 
+ 5/9 



-3 



T —3 

—2 + (b 3~— 3— 2 } 

—3—3 



1 

9 



1 

-1 

4 



1 

’9 



-.8 

.8 

.4 






c„ = 



C 2 - 



T 

—3—3 

T 

—3—3 



0 

0 

0 



0 

0 

0 



is the unique solution to the given set of equations. 
The iteration sequence as well as the planes described by 
the set of equations are shown in Fig. 4.1. 

2 . Sets of Inconsistent Equations 

If the set of equations (4.1) is inconsistent, as 
is usually the case in estimation problems, the vector b 
is not completely contained in the vector space spanned by 
the column vectors of A. A solution may be obtained by 
solving the equation 



Ax = b A 



(4.13 
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where b^ is the part of b contained in the column space of 
A, or the projection of b into the column space of A. The 
remainder b^ = b - b^ represents that portion of b which 
is ignored in the solution, and is orthogonal to the space 
of A. This represents an optimal choice for b A which may 
be shown as follows: 

Let the vector b be decomposed into two components 

b = b^ + b E (4.14) 

where b^ is contained in the space of A and b E is the part 
of b which is ignored in the solution process. If is 

the solution of the consistent set of equations 



^A = A *k 



then Hb-AXjJI = || b^ - Ax k + b E 



I — E 



^ ij> 2 

where ||b E || = T. (b^^ - a-x^) 



The minimum of || b E || is obtained when b E = b^. Then the 
solution for x of the inconsistent set (4.13) is, by 
definition, the least-square-error solution. The standard 
way of achieving this projection of b is to premultiply 
(4.2) by A T . 



T T 

A Ax = A b 



,T, + A b M 

= A b A — N 
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but since 




0 



T T 

A Ax = A b 



-A 



(4.15) 



Since (4.15) is a set of consistent equations, it may be 
solved using (4.11). 

Another finite-step process for the solution of (4.1) 
which starts unconditionally but involves a little more 
computation may be derived following arguments similar to 
those which led to (4.11). Again consider the equations in 
(4.2). In order to obtain the minimum-norm solution (if 
the set is underdetermined) the desired solution may be 
represented, as in the previous section, as a linear com- 
bination of the row vectors of matrix A, or, equivalently, 
as a weighted sum of different base vectors representing the 
same space. The set of Eqs . (4.1) , if it is assumed to be 

inconsistent, may be written 



where the Aa^. vectors are constructed to be mutually per- 
pendicular and A^Aa^ is the part of b parallel to the vector 
Aoij,. Thus the coefficients A^, are determined from the 
condition 



b = Ax + b 



-N 



(4.16) 



where again b. 7 is orthogonal to the space of A 

— £sl 

Now let 



b = A^Aa-^ + \2&SL2 + 




(4.17) 
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0 



(4.18) 



( A £ k ) (b - o^Ax^ 



as 



T„T. 
a. A b 
— k — 

T T 

OL A Ad, 

— k — k 



(4.19) 



Substituting (4.19) into (4.17) yields 



T^T, T T 

a^A b a^A b 

b — iji ip Aa^ t ... + ^ ip Aa_ r + b^j 



a^A Aa^ 



a A Aa 

— r — r 



= A 



* T T 

r a^A b 

E T T -k 
k=l a^A Aa^ 



+ b. _ 
— N 



(4.20) 



Comparison with (4.16) yields the desired solution, namely, 



T T 

r o. A b 
~ v — k — 

X = 2 m 2L k 

k=l a. A Aa. 

— k — k 



(4.21) 



The mutually perpendicular vectors Aa^. are constructed using 
the Gram-Schmidt process. Using (4.9) with replaced by 
Aa^ and a^ replaced by Aa^ , yields 



A 2k 




= C, . Aa. if Aa, =0 , recalculate Aa. using Aa..ri 
k-1 —l ~k — — k —l+l 



■ C 



k-1 



T T 

Aa, a, A 
— k— k 

T T 

a, A Aa, 
— k — k 



i = 1 , 2 , . . . ,m 
k ■ 1,2,. ..,r 



(4.22) 



Another equally acceptable set of mutually orthogonal 
base vectors for the column space of A may be obtained from 
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f 



the vectors Ad ^ , where the d ^ ' s are the rows of the matrix 
T 

A A, since the column space of A may be expressed using the 
set of all the vectors Ad^ , as well as the set of all 
vectors Aa ^ . Using Ad^ instead of Aa^ in (4.22) requires 
less computation if the dimensions of A are such that 
m»n. Thus it is possible to write, starting with C Q = 0. 



A3 k = C. 



, , Ad . 

k-1 — 1 



if AB_ k =0_, recalculate AB_ k using AcL + ^ 



C k = 



'k-1 



T T 

A6 k 6 k A 
T T 

V AB k 



i 1 , 2 , . . . , n 

k = 1 , 2 , . . . , r 



(4.23) 



where B_ k replaces u k in (4.17) and (4.21). 

In order to obtain the solution (4.21), Eq. (4.23) is modi- 
fied to yield an explicit form for the calculation of the 
^ k ’ s , which when multiplied by A yield the orthogonal base 
vectors for the column space of A. 



'k-1 



= I - 



T T 

ABjijA 
T T 

il A A £i 



T T 

A ik-A-! A 

T T 

4-i A A %-i 



= i - 



k-1 

E 



T T 

A Ii£i A 



T T 

i=l BjA A6„ 



then 



T T 

k=l AB.bJa 1 

A £v = [I - 2 rp m 1 Ad. 

K i=l Bj;A i AB i 



T T 

k-1 6_. 6 A A 

= A [I - £ -jp-i 1 d. 

i=l B. A AB, 

— 1 " JU 
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\ 

\ 

' \ 

and 

d. 

—l 

Combining (4.24), written in recursive form, with (4.21) 

yields the desired finite step process for the solution of 

(4.1) . Let C = I then 
o 



s* - 



k-1 6 i 6iA I A 

^ t T 
i=l 6 A X AB. 



ik 



= C n 



,_^d^ if = 0, recalculate using d. 



•i+1 



\ 



C, = C 



k-1 



T T 

44 A A 

T T 

— k A A ^k 



T T 

B^Vb 

/v /v — K — 

-k = -k-1 + _ T~T 

V. A % 






i = 1 , 2 , . . . ,n 
k = 1,2 , . . . ,r 



J 

(4.25) 



The process is completed when k = r. £ r is the desired 
solution . 



C. MATRIX PSEUDO INVERSION 

The foregoing computation methods may be extended to 
yield matrix inversion or pseudo inversion. However no com- 
putation time comparison with already existing methods has 
been made. Consider the solution of the matrix equation 

AX = I (4.26) 

where the matrix A is square of dimensions n x n and non- 
singular. The inverse of A may be obtained as the solution 
of (4.26) from (4.12). Let P « I and X = A T then 
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x = X + Pk ~ 1 ~ k (i T - a T X > 
X k X k-1 T d — k -k X k-1 

a, P. . a, 

— k k-1— k 



(4.27) 



P. = P. n + 
k k-1 



T 

p k - A ^ p k - 1 

-k p k-i-k 



k = 1,2, 



,n 



. T . 

where 1 ^ is the i ' th row of the identity matrix and the 

n ' th approximation for X, X n is the desired inverse. The 
. . T 

initial condition X = A is chosen to ensure that the 

o 

starting conditions for (4.12) are satisfied, since 
a 2 , a^, ..., a n are, by virtue of the nonsingularity of A, 

not parallel to a ^ . 

The pseudo inverse may be obtained as follows. 

Let C = I and X rt = 0 , then 
o 0 ' 



6, = C, .d. if 6 =0, recalculate 6, using d..?^ 

— k k-1— l — k — — k 3 —l+l 



C, = C, , 
k k-1 



T 

— k— k T 

k k - a l a 



T T 

— k A A «k 






x k = X k-1 + ~ T T 

k K 67A A6, 

— k — k 



k — 1^2/# • • fk 



y(4.28) 



T T 

where d. denotes the i ' th row of A A, and X^. is the solu= 
—l r 

tion for the pseudo inverse of A. Proof of (4.28) follows. 
From (4.28) it is evident that 



T T 

6 .A AS . = 0 

-l -: 



for i ^ j 



(4.29) 
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since the vectors A6_^ (i = l,2,...,r) are orthogonal. Also 
T 

A A may be expressed as a matrix whose rows are linear com- 
binations of the set of vectors . Thus 



A T A = [6J 6 2 | . . . \§_ r ] M 



(4.30) 



The expression for C r from (4.28) may be rewritten as 



C = 
r 



I - 



r 6_.6?A T A 
£ 11 



i=l 



T T 

6_!a A6^ 



= I - 



r 

Z 



T 

iiil 



T T 
_i=l 6TA A6. 



T 

A A 



(4.31) 



where the summation is identified as Y 

T 

r iiii 

Y = E T V (4.32) 

i=l 57 A 1 A^ 

so that 

C = I - YA T A (4.33) 

Using (4.29) it follows that 

C r 6_ i = 0_ (4.34) 

and thus 

C r A T A =0 (4.35) 

and C^Y = 0 (4.36a) 
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or 



Y 



T 

YA AY, 



(4.36b) 



• T 

Using (4.36b) to expand the product A AY yields 

T T T 

A AY = A AYA AY 

or [A T A - A T AYA T A]Y = 0. (4.37a) 

Since Y ^ 0 this yields 

A T A = A T AY A T A (4.37b) 

From (4.35) 

T T T 

A A - YA A A A = 0 

and then also [A T A - YA T A A T A]Y = 0. 

This is combined with (4.37a) to yield 

[A T AY - YA T A] A T AY =0. (4.38a) 

Since A T AY ^ 0 , then 

A T AY = YA T A. (4.38b) 

Eqs . (4.36b), (4.37b) and (4.38b) by definition establish Y 

T 

as the pseudo inverse of A A: 

Y = [A T A] + (4.39) 

Comparing (4.32) with the last equation in (4.28) when 
k = r yields 

X r = YA T = [A T A] + A T = A + (4.40) 
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Penrose [5] has also suggested a recursive method for 
computing the pseudo-inverse. Let 

C 1 " 1 

then 

C k+1 = 1 * k trace (C k A T A) “ c k ATA (4.41) 

T . T 

The product C r+ ^ A A = 0, where r is the rank of A A, 

T 

then the pseudo inverse of A A is 

[A T A] + = 7p— C (4.42) 

trace C r A A 

The proof is given in Ref. 5. It should be noted that the 

Penrose method involves at least one matrix multiplication 

3 3 

for each step, or approximately rn operations, where n 

operations are required to perform the multiplication of 

two n x n matrices, whereas the method of (4.28) requires 

2 

approximately 5rn operations to obtain the same result, 

T + 

namely [A A] . 



102 



V. RECURSIVE ALGORITHM FOR THE 



SLIDING-WINDOW OBSERVER 

Since the work of Luenberger [11] , deterministic linear 
observation systems, called observers, have been recognized 
as practical alternatives to statistical optimum linear 
filters when efficient and fast real-time estimation of 
the system states is desired. Avoiding problems associated 
with the estimation of a priori statistics, the observer 
simply solves the estimation problem as a deterministic one 
and disregards statistical quantities. The simplest formu- 
lation for the observer is the sampled-data type which 
accepts the measurements or observations only at discrete 
points in time. 

Consider the sampled-data system 



-k $ k,k-l — k-1 



z, = M, x, 
k k — k 



(5.1) 



where is the system state vector at time instant k, 

$k j <; _^ is the general time-varying transition matrix from 
time (k-l)T to kT. is the time-varying observation 

matrix of dimension 1 x n, and z ^ is the scalar observation. 
As in other chapters , only the case of scalar observations 
is considered here in order to obtain results without time 
consuming matrix inversions. The observer for the system 
(5.1) is given as 
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(5.2) 



x. = $. , , x, , + g, ( z, -M, $. , , x, . ) 

— k k , k-1— k-1 ^-k k k k , k-1— k-1 

where x^ is the estimate at time kT. 

This observer equation may be rewritten as 

-k = F k— k-1 + 2k z k (5.3a) 

where 

F k ■ t 1 - 2A]* k ,w (5 -3b) 

The matrix is called the observer transition matrix. 

Bona [12] has shown that the eigenvalues of , which are 
dependent on the choice of g^., determine the performance 
of the observer in processing noise-contaminated observa- 
tions. The time-varying gain, g^, for the optimal filter 
is determined such that the trace of the error covariance 
matrix is minimized. For a specific time-invariant 
observable system [3], Bona [12] has demonstrated that 
constant gain observer with eigenvalues of approximately 
0.5 approach the performance of the Kalman filter. As an 
observer design rule for time -invariant systems, he suggests 
the choice of the largest eigenvalue X L of F so that 

o £ 

(A t ) is approximately zero, with the result that (F) is 

approximately zero thereby limiting the memory of the 
observer to approximately the last £ observations. Because 
of the size limitations of the memory, this type of observer 
is also referred to as a sliding-window observer. 
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In the following work, stimulated by a list of un- 
solved observer problems [13] , a recursive relationship 
for the sliding-window of minimal length for the general 
system of Eq. (5.1) is developed in Section A. In Section 
B these results are extended for the time-invariant system 
to yield the recursive form for a sliding-window of speci- 
fied length. The approach to the solution is quite 
different from the one discussed by Bona, resulting in a 
new design rule for sliding-window observer of exact 
specified memory length. Finally, the results obtained are 
illustrated by an example. 

A. THE MINIMUM-WINDOW OBSERVER 

The minimum-window observer is the fastest linear 
observer possible in that it determines the states of a 
system from the necessary minimum number of observations. 

The eigenvalues of the observer transition matrix are all 
zero [12]. The desired recursive form is derived as follows. 
Consider system (5.1) and its observer equation (5.3). 

Suppose the system is of order n. At each instant of time 
k the state vector, x^., is determined from the last n 
observations. Thus 



x. (5.4a) 

— k 



Z k-n+l 




M, 

k-n+1 


$ 

k-n+1 ,k 


z k-n+2 




M. , -i 

k-n+1 

e 


$ 

k-n+2 ,k 


« 

• 

z k-l 




• 

M k-i 


^k-1 ,k 


_ z k 




M k 
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or 

z^. = R (k) x k (5.4b) 

where the matrix R(k) relates the n last observations z_ k to 
the state vector x^. The observability condition for this 
system is then that R(k) is nonsingular for all k. Because 
of this it is not necessary that each of the rows of R(k) 
be the product of an observable pair. The solution to 
(5.4b) is then trivial 

x k = R(k) _1 z k (5.5) 

where x k denotes the output of the observer. Let the 
matrix R -1 (k) = C(k) be partitioned into column vectors 



C (k) = 



c. (k) 1 c 0 (k) 1 ... 1 c (k) 

— 1 i — z i i — n 



(5.6) 



Then (5.5) may be written as 



-k “ -l (k)z k-n+l + -2 (k)z k-n+2 +,,,+ -n-l (k)z k-l + -n (k)z k (5,7) 



Expanding Eq. (5.3) yields 



*k “ F k*k-1 + 2 k ^ k 



= F k F k-l— k-2 + F k2 k -i z k -i + 2 k z k 

= F k F k-l‘ ' * F k-n+l— k-n+1 + F k F k-l* * * F k-n+22 k - n +l z k-n+l 

+ F k F k-l* • ,F k-n+32 k - n +2 ' z k-n+2 

+ F k2k-l z k-l 

+a k z k (5.8) 
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Comparing (5.7) and (5.8) on a term by term basis leads to 
the following conclusions 



(1) F k F k-l’ * * F k-n+l “ 0 



(2) in = ^n (k) 



(3) £j (k) = F k c. +1 (k-l) j = 1 , 2 , . . . , n-1 



so that 



(k) 



F k— 2 (k-1) I * ‘ * | F k-n (k-1) | C n (k) 



(5.9a) 

(5.9b) 

(5.9c) 



( 5 . 9d) 



The recursive relation for the gain vector £ k = £ n (k) may 
be determined from (5.9a) using (5.9c) repeatedly. Thus 

F k [F k-l--- F k-n+l2k-n 1 = F k Si'k- 1 ) “ 0 f 5 - 1 

since 



F, = [I - g, M. ] , , 

k ^-k k k-l,k 



(5.10) may be rewritten as 



Vl.kHl^- 11 “ 2 k \ * k -l,k 



(5.11) 



The desired result is then obtained directly from (5.11) 

-1 



2k = £ n (k) = \_i fk £i'~ - - k "k-l,k -1 



(k-1) [M, $. 



c, (k-1) ) 



(5.12a) 



Note that for scalar observation Eq. (5.12) may be also 
written as 



% = 



M k $ k-l,k-l (k_1) 



(5.12b) 
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For scalar observation the desired algorithm is therefore 
given by Eqs . (5.12b), (5.9b), (5.9c) and (5.3). This is 
summarized in (5.13): 




A 



J F k [I 2fc M k ] \-l,k 






(5.13) 



£j (k) = F k £j + 1 (k-l), j = 1,2 



/ • • • r 



n-l 





J 



This completes the derivation of the recursive algorithm for 
the minimum-window observer. The computation time required 
is almost equal to the computation time required for the 
corresponding optimal filter (3.38). For linear time- 
invariant systems with a time-invariant observation matrix, 
the observer gain and transition matrix are constant. 

Thus (5.13) reduces to the observer equation 



and the required computation time is reduced substantially. 
Although the minimum-window observer is highly sensitive 
to measurement noise, practically excluding the direct use 
of (5.14) for non-deterministic problems, it is nevertheless 
possible to construct acceptable filter schemes from (5.13) 
or (5.14) for special applications. Consider, for example, 



*k = F *k-i + a z k 



(5.14) 
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an oscillatory time-invariant system given by Eq. (5.1) with 
zero damping. Then a simple averager following the minimum- 
window observer is characterized by 



/\ 




F 2k-1 + 



& 





$ 



k-i 



A 




k-1 

k 



* 



*2k-l 



+ 



1 - 
k *k 



(5.15) 



where x^. , the output of the combined filter, results in a 
filter performance almost indistinguishable from the optimum 
filter (3.38). This is demonstrated in Example 5.1. Note 
that the computational requirements are only a fraction of 
the computational requirements for the optimal filter. It 
is the author's opinion that this result merits further in- 
vestigation in order to find some design rules for simple 
and fast observers with almost optimal performance similar 
to the one given in Eq. (5.15). 



Example 5.1 



Let the system equations (5.1) be given as 



x, = $x. i 
— k —k-1 



.9 -.5 

.38 .9 



2k- 1 



z k ^-k + v k [1 0] -k + V J 



(5.16) 



where the observations are now noise-contaminated with the 
measurement noise , a noise sample drawn from a uniform 
distributed noise population of zero mean with maximum 
deviation of ±0.1. Estimate the system state vector x^ 
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using (5.15) when the system starts with ^ 

Results for a typical noise sequence . . . v^,} are shown 

for both state variables xl^ and X2^ in Figs. 5.1 and 5.2. 
Note that the filter response closely matches the actual 
system response. The estimation error defined by 




is compared with the estimation error of the least-square- 
filter in Figs. 5.3 and 5.4. 



B. SLIDING-WINDOW OBSERVER FOR TIME- INVARIANT SYSTEMS 



For the case of time-invariant observable systems with 
a time-invariant observation matrix it is possible to in- 
crease the memory of the observer to an arbitrary length £ 
where £ >_ n. This results in a sliding-window observer of 
length £, where the state vector x^ is determined from the 
last £ observations in the least-square-error sense. Con- 
sider the first set of £ observations according to (5.1). 



Z 1 




M$ 


Z 2 




.. . — £+2 
M$ 


i — 1 

1 

o* 

• • • N 




MO -1 


_ z £ 




M 

L J 



(5.18) 



or 

H “ A U' 



no 



FILTER RE5T0N3E FCR STRTE URRIFinLE XI 




FILTER RESPONSE FCR STRTE URRIfiRLE X2 



! 




FIG. 5.2 EX. 5.1 



+ RC T UPL SYSTEM RESPONSE 

FILTER RESPONSE EQ- (4-15) 
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ESTIMATION ERROR IN STATE UARIRBL r XI 



r 

oc-t> 

(tO-=T 





00 ■? * jO'Z- 00 ’V CO-S- 

t btmtta Noribwiisi 
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FIG. 5.3 EX. 5.1 
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where is the observation vector of the last £ observa- 
tion, x^ is the n dimensional state vector at time instant 
£, and A is the £ x n matrix of constants relating observa- 
tions and state vector x^. Since the system is assumed to 
be observable matrix A has rank r = n. Then the solution 
to (5.18) is given by 

x = PA T z £ (5.19) 

where P = [A T A] ^ 

Eq. (5.21) may be written more explicitly as 



~ ..-£+1, T..T . ,.-£+2,T m T ^ . ,_-l.T„T ...T } 

x^ = P{ ($ ) Mz^ + ($ ) M Z 2 +... + ($ ) Mz 

(5.20) 

At time instant £+1 Eq. (5.18) takes the form of Eq. (5.21) 
since the memory of the filter is limited to £ observations. 



z 

z 

• 

z 

z 



2 

3 



£ 

£+1 



A — £+1 



( 5 . 21 ) 



Then analogous to (5.20), the solution of (5.21) may be 
written as 



«r/*“£+l*T M T , , . -£+2 . T..T , . /A -1.T..T ,.„T •. 

x^ +1 = P{($ ) M z 2 + ($ ) M z 2 + ... + ($ ) M z e +M z e+1 -f 



To obtain a recursive algorithm it is necessary to combine 
(5.22) and (5.20). Thus 

x £ - P(<T £+ 1 ) T M T z 1 = P$ T {(<T £+ 1 ) T M T z 2 +. . . + ($ - 1 ) T M T z} (5.23) 
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-£ + ]_/s 

Substituting = M$ X ^ , which is true for noiseless 

— T rp _ 1 

observations, and premultiplying (5.24) with P(0 ) P 

yields 



P{(0 £+ 1 ) T m T z 2 +. . . + (0 Vm\} 



n ,*-1. Tr -1 /a -A+1. T..T...-A+1, - 

P ( $ ) { P - ( $ )MM$ } x ^ 

P{ (0 ) A AO -(0 ) M MO }0x^ 

n r , ,-l, T r /.-£+l. T,,T«,.-Jl+1 ^ /a -£+2.T..T„ a -A+2 , 
Pi(0 ) [(0 ) M MO + (0 ) M mo + 

+ M M] 0 - (0 ) M MO } Ox^ 



n/ / ■■i.+ l, T T Ji+1 .-T..T.. . — 1-, . ~ 

P{(0 )MM0 +...+0 MMO }0x^ 



r T T . ~ 

= P { A A - M M} Ox^ 

= [I - (PM T )M] Ox^ (5.24) 

T 

Identifying g = PM (5.24) may be written 

P { ( 0 T M T Z 2 +. • • + ($ X ) T M T z^} = [I-gM]0x^ (5.25) 



It is interesting to note that is given by the last column 
of A + , as a generalization of (5.9b) of the previous section. 
Substituting (5.25) into (5.22) yields the desired recursive 
relationship 



*4+1 = U - gM] 0x^ + a z £+1 (5.26) 

where the estimate x^ + ^ is no longer dependent on the ob- 
servation z,. It then follows that the next estimate 
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— 1+2 ^ 1 £+1 + 3.^+2 (5.27) 

is no longer dependent upon the first two observations z^ 
and Z£. Therefore the general recursive formulation for all 
discrete times k > £ is given by 



Sk-i = FS k + a z k+i 



(5.28) 



where F = [I-gM]$ and g remain constant. The complete 
algorithm for the general rectangular sliding-window ob- 
server, with the first state estimation available after 
l observations are processed, is then 



/s n -1 —IT* 

x £ = [A A] A z £ k = l 

4+1 = Fx k + 2. z k+1 k > l 



(5.29) 



The estimation error 



x, = x, - x, 
— k — k — k 



(5.30) 



A 

obeys the same dynamic relation as x, . Thus 



4+i = F 4 + 2y 



k+l 



(5.31) 



If the covariance matrix of x k is denoted by C k it follows 
from (5.31) that 



C k+1 x k+l x k+l FC k F + 2. R 2. 



(5.32) 



where R is the variance of the measurement noise. Let 



e k = |Vc k (l,l) | + |Vc k ( 2 , 2 ) 
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where the sequence of matrices, , is obtained from (5.32), 

th 

R = 1, and C^(i,i) denotes the i diagonal element of 
the covariance matrix. is a relative measure of the 

expected absolute estimation error at time instant k. A 
comparison in terms of this relative absolute estimation 
error for a few sliding-windows of the system in Example 5.1 
with the corresponding error of the least-square-error 
filter is shown in Fig. 5.5. 
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VI . SUMMARY AND RECOMMENDATIONS FOR FURTHER STUDY 



(1) The normalized least-square-error solution of a set 
of inconsistent linear equatons has been established as 
an alternative to the usual least-square-error solution. 

The solution is obtained by minimizing a weighted least- 
square-error criterion and presented in recursive form 
for obtaining sequentially, the solution of a growing set 
of equations. The technique is illustrated by some 
estimation and identification problems. Further investi- 
gation is required to establish a decision criterion to 
determine whether the normalized least-square-error or 
the least-square-error solution method is to be preferred 
in engineering problems. It is expected that this decision 
criterion, in the case of problems involving discrete 
state estimation or parameter identification, will depend 
not only upon the system equations themselves but also upon 
the nature of the measurement noise. 

(2) Complete recursive algorithms for the normalized least- 
square-error solution and the least-square-error solution 
based upon the concept of the best approximate solution [5] 
are presented. 

(3) A finite step algorithm for the calculation of the 
pseudo inverse and the best approximate solution of a fixed 
set of linear equations is proposed. This result has 
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advantage over previously published recursive computa- 
tion methods in that the algorithm presented starts 
unconditionally . 

(4) Finite-memory, linear, observation filters in recur- 
sive form are proposed for the sequential state determin- 
ation of a sampled-data system. Design rules for these 
observation filters, when the system is time-invariant, 
are given: 

(a) for the minimum-window and averager observer where 
the system states as determined from the minimum set of 
past data, are smoothed by a simple averager. This 
procedure is shown to be very efficient for an oscillatory 
system with zero damping. Further investigation is recom- 
mended to improve the estimation for general systems, using 
the minimum-window observer together with a weighted 
averager. It is expected that optimal weighting can be 
approached using a scalar weighting factor in the recursive 
form. 

(b) for the sliding-winding observer of memory length 
A, A > n, where the n states of the system are determined 
from the last A measurements in the least-square-error sense. 

(5) In the Appendix, the recursive algorithm of the best 
approximate solution is applied to the numerical solution 
of a set of nonlinear equations. The results are promising 
in that solutions are obtained when other well known linear- 
ization or gradient techniques fail. The theory behind 
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this approach has not been investigated fully and it is 
recommended that further mathematical work be persued to 
establish rigorous proofs and conditions of convergence. 
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APPENDIX A 



A. ITERATIVE SOLUTION OF A SET OF NONLINEAR EQUATIONS 
The solution of simultaneous nonlinear equations is 
often impossible to obtain analytically, and graphical or 
iterative methods for a computer solution have to be em- 
ployed. In addition, the solution of a class of nonlinear 
differential equations, as discussed in Section B, reduces, 
after integration, to the solution of nonlinear algebraic 
equations at each step of the integration process [16], 

The most commonly used iterative methods are based upon 
linearization techniques, i.e., Newton-Raphson method and 
linear interpolation, or upon some gradient method whereby 
the iterative approximation to the solution is sequentially 
improved such that some error measure is forced to decrease 
[16]. All these methods however may not converge, or they 
may converge to a solution at infinity. In addition, the 
values of the functions as well as their gradients may 
have to be calculated at each iterative approximation. 

This calculation might be impossible if the approximation 
is outside the range or domain of one or more of the 
functions, or if one or more of the functions has a dis- 
continuity at that particular approximation point for the 
solution. Further complications arise from the fact that 
the set of nonlinear equations may have more than one 
solution and the above iteration schemes may converge (if 
they converge at all) to an undesirable solution point. 
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Using the solution methods for a set of linear equa- 
tions, as discussed in Chapter IV, a new iterative pro- 
cedure is developed. This procedure circumvents the problems 
and drawbacks of the foregoing methods, and converges to a 
single point if one or more solution points exist. If this 
point is no solution point the initial estimate has to be 
displaced in the direction of the preferred solution point, 
where the preferred or desired solution is defined (in 
accordance with the concept of the minimum-norm solution) 
as the solution which is closest to the initial estimate. 

The class of nonlinear functions included in the iteration 
process in general are all functions which can be represen- 
ted in polynomial form or which have power-series expansions. 
1 . Development of the Method 

Let a set of n nonlinear functions in n unknowns be 

given as 

= 0 (A. 1) 

Using the polynomial form, or a power-series representation, 
(A.l) may be written as 

Ax + Bg(x) = c (A. 2) 

where A and B are constant matrices of dimensions n x n and 
n x m respectively. The elements of the n x 1 vector 



h l (£) 

h 2 (x) 

h (x) 
n — 
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T 

c = [c-^ ... c ] are the negatives of the constant terms 

in the polynomial or series form. The vector 

T 

g(x) =- tg-L (x) • • -g (x) ] has dimensions mxl. Its elements, 
the functions g^ (X) are the nonlinear remainder terms of the 
corresponding function h^ (x) , or parts thereof, chosen such 
that the functions g^ (x) are defined for all x. 

Range-or-domain-limited functions g^ (x) can be accepted 
only if it is possible to reformulate these functions in 
terms of variables for which they are always defined. As 
an example consider the equation 

y = Jinx =0 (A. 3a) 



which has no real solution for x 0. However the same 
equation may be written as 

- x = 0 (A. 3b) 



or 



U + y 





+ . . . } 



x = 0 



(A. 3c) 



or 



I-i + 1] 



where 



g n (y) 



[;}* 



[1] g, (y ) +1 = 0 



2 
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3 

3! 



(A. 3d) 



(A. 3e) 



Now g^(y) is defined for all y and the domain- limited Eq. 
(A. 3a) is acceptable in the iteration process in the form 
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of Eq . (A. 3d) . This separation of the nonlinear part of h^ 

from the linear and constant part has the advantage that 
the discontinuities of h^ do not appear in g^ . Thus, for 
example, if 



2 yx + x - 1 = 0 



(A. 4a) 



or 



[1 



0 ] 



[2] xy = 1 



(A. 4b) 



y is not defined for x = 0. However the function 



g 2 (x,y) = x.y 

is defined for all x and y. 
Let 



(A. 4c) 



x = x^ + Ax (A. 5) 

where x is the desired solution, x is the initial estimate 
— — o 

for x, and Ax is the necessary correction to x^ in order to 
satisfy Eq. (A.l). Eq. (A. 2) may be rewritten as 

AAx + B[g(x +Ax)-g(x ) ] = c - Ax - Bg(x ) (A. 6a) 

— — — o — — — O — — O — o 

or 

[A + BD (Xq/ Ax)] Ax = c_|_ (A. 6b) 

where c' is a vector of constants defined as the right side 
of Eq. (A. 6a) and the elements of the m x m matrix D are 
defined from the total difference quotients of the functions 
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g^ (x) , (i = Thus 

[Dfx^, Ax)] Ax = g (jc^ + Ax) - g (x^) (A. 6c) 

This may be best explained with an example. Again consider 
Eq. (A. 4c) . Then 



g 2 (x o + Ax, + A„) - g 0 (x^, yj 



2 'o' 

i 



= (x + Ax) (y + Ay) - x y 
o u o 1 o J o 



= X Q Ay + y Q Ax + AxAy 



(A. 7a) 



which may be written as 



I? 



o 



+ 



2 




(A. 7b) 



Then the terms (y Q + ^-) and (x + A^-) are elements of the 

matrix D. In general all the elements in the matrix D 

are dependent on the initial estimate, which is a constant 

vector during the iteration process, and the value of Ax , 

which is unknown. The algorithm proposed sequentially 

updates an estimate for the elements of Ax) until the 

true values are obtained. Then the last approximation for 

Ax is the desired correction for the solution given in 

(A. 5). From (A.l), and the given initial estimate 

calculate A, B and c' as defined above. Let Ax, denote 
' — — k 

th 

the k approximation for Ax and set Ax q = 0. Then solve 
iteratively. 



[A + BD (x^ Ax k ) ] Ax k+1 = c' 



(A. 8) 
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Using (4.25) for AXj ( ^ until (A. 5) satisfies all equations 
in (A.l). The solution is obtained from (4.25) because the 
matrix in (A. 8) may become singular at some step in the 
iteration process. This describes the basic technique, 
whereby the solution is obtianed by iterative approximation 
of the exact difference quotients of the functions g^ (x) 
and not by iterative improvements of a previous approxi- 
mation to the solution. If the process (A. 8) converges and 
the error 

n 

e = I | h. | (A. 9) 

i=l 1 

is acceptably small, the solution (A. 5) usually is the 
desired solution closest to the initial estimate. However, 
if process (A. 8) converges to a value of Ax for which e is 
not small, then x = + Ax lies between two or more 

solutions of (A.l). In this case a displacement of the 
initial estimate is necessary and the iteration has to be 
repeated. As shown in subsequent examples the values for 
the elements of AXj, oscillate in a damped manner about 
their exact value. In order to increase the rate of con- 
vergence and, more importantly, to force convergence if the 
damping of these oscillations is slightly negative (which 
would eventually lead to divergence) additional damping may 
be introduced. This is accomplished by using a weighted 
average between Ax^_;l and Ax^ for the computation of the 

elements of Dfx^, Ax^) . Then solve iteratively starting 

/\ 

with Ax =0 using (4.25) 
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(A. 10) 



( \ 

[A + BD(xo,Ax^.] Ax^ + ^ = c ' 

s > 

A £ k +1 - <4 k + U-“> Aik+l 

where a, 0 <_ a <1, is the coefficient of additional damping, 
a is determined from the rate of convergence of the process 
(A. 10) starting with a = 0. If after a few steps of the 
iteration are completed the convergence rate is considered 
too small, a may be increased and the process reinitiated. 

This completes the development and discussion of the new 
algorithm for the solution of a set of nonlinear equations. 
In order to show the power of the iteration method (A. 10) 
two examples, where other techniques may fail, are given 
below. 

2 . Experimental Results 
Example 1 

Find the point of itersection of the two curves 
y + 2xy + x = 1^1 

2 f (A. 11) 

y-xy + x = 1 J 



closest to the -given point (x Q ,y o ). 

According to (A. 6a) and (A. 6b) this set of equations may be 
written as 



1 1 
.0 1 




r 

0 

CN 

1 


xy- 




~1 


— 1 
1 — 1 

1 — 1 
_L 


1.x 2 _ 




_1_ 



(A. 12a) 
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or 



1 1 
0 1 



2 0 



-1 1 



r , • _y 

y 0 + 2 



1 



2x +Ax 
o 



Axl 
+ ~2 ! 




r A x~ 

L Ay_ 



1 




1 


1 




V 




2 


0 




X o*o~ 


1 




_0 


1_ 




-*o- 




_0 


-1_ 




X 2 

O 



(A. 12b) 



Eq. (A. 12b) is now in the desired form for (A. 10). 
Starting from the initial estimates 




with no additional damping (A. 10) yields the following 
sequences for Ax^. 



-1.0 




.33" 




-.6 




-.1595 


• 

O 
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1 

H 1 

• 

00 

\ 


/ 


_ 2 . 0 


• • • 

/ 


1.703 



.25 




. 3975 




. 3993’ 


2 . 25_ 


f 


_1 . 9 3 


• • • 

f f 


_1. 895 



c) 


. 883 ~ 




.516 




.5993^ 




263 _ 


r 


-.052 


• • • 

/ 


_- . 1051 



Thus the following solutions are obtained 



a) 




_yj 



-. 1595** 
1.703 
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b) 



X 




1.399 


_y_ 




_-. 10 5 _ 


X 




1 . 399~ 


_y_ 




_-. 10 5_ 



The graphs of the functions in (A. 11) and the three solu- 
tions are shown in Fig. A.l. Fig. A. 2 shows the oscilla- 
tion of the sequential values Ax for case (a). Fig. A. 4 
illustrates the dependence of the rate of convergence upon 

the additional damping a where it is assumed that the 

-4 

solution is obtained whenever e < 10 

Note that in case (b) none of the iteration methods that 
depend upon function values could have initiated an iteration 
and that in case (c) other methods would have converged to 
a different solution point. 

Example 2 

Find the point of intersection of the two curves 



y + 2xy + x = 1 
2xy + x = -2 



(A. 13a) 



closest to the point of (x o ,y Q ) 



The solution 



"x* 




- 2/1 


_y* 




+ 3 



, obtained directly from sub- 
stitution is the only finite solution. Thus, no matter 
what the initial estimate is, the desired solution is 



[x* y*]. The set of Eq. (A. 13) may be rewritten as 
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ILLUSTRATION TO EXAMPLE I 




FIG. A.I 
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VALUES OF AX AND Av 



OSCILLATION OF AX AND AY - CASE A 




FIG. A. 2 EX. I 
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NUMBER OF ITERATIONS 



CONVERGENCE AS A FUNCTION OF 
ADDITIONAL DAMPING - EXAMPLE I. 




FIG. A. 3 
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1 1 
1 0 



2 

_2 J 

l" 



.4* 



-2 



L y C 2 



1 1 
1 0 



oX +-y 
O ^ 



n 


p — 

Ax 


J i 


_Ay_ 



^ Y o- 



2 

2 



(A. 13b) 



x y 
cr o 



similar to other iteration methods, (A. 9) will diverge 



whenever the initial estimate x >0. While the other 

o 



■ M 



methods diverge because they iterate towards the solution 

, method (A. 9) diverges because the oscillations 
in Ax become increasingly larger. This situation is reme- 
died by using the method of Eq. (A. 10) with an additional 
damping factor of approximately a = .7. Now the iteration 
for Ax converges rapidly so that the desired solution is 
obtained in only 8 to 9 iteration steps. The solution 
for the first few iteration steps, starting with the 



initial estimate 



[ft] ■ [-‘J 



as well as the graphs of the 



functions defined by (A. 13a), are shown in Fig. A. 4. 



B. SOLUTION OF THE DYNAMIC RESPONSE OF CIRCUITS CONTAINING 
NON-LINEAR RESISTIVE ELEMENTS* 

The exact solution of the dynamic response of a circuit 
containing non-linear resistive elements such as diodes or 
transistors often presents problems, even when the non-linear 
characteristics are known, because it may not be possible 
to solve for the required variables explicitly [6], The 



The material in this section up to Eq . (A. 21) has been 
published in the Proceedings of the Second Annual Princeton 
Conference on Information Sciences and Systems, 1968. 
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ILLUSTRATION TO EXAMPLE 2 




FIG. A .4 
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non-linear resistive elements are considered as dependent 
current (or voltage) sources, dependent upon state or 
other variables in the network. The network is then 
characterized by the following equations. 

= Ax + B ± u s + B 2 i n (A. 14) 

= N f(v n ) - (A. 15) 

= Rx + S ± u s + S 2 i n (A. 16) 

= the state variables 
= independent sources 

= equivalent current sources for the non-linear 
elements 

= corresponding voltages across the non-linear 
elements 

The matrices A, B^ , B 2 , R, S^, and S 2 are determined from 
the network and matrix N is defined from the characteristics 
of the nonlinear network components. The roles of v n and 

i in (A. 14), (A. 15), and (A. 16) are interchanged when 

— n 

dependent voltage sources are used. Eq. (A. 14) is the state 
equation for the linear part of the circuit. Eq. (A. 15) 
formulates the characteristics of the non-linear components 
as given, for example, by the Ebers and Moll [20] equations 
for transistors. Eq. (A. 16) gives the circuit constraints 
(loop equations for equivalent current sources and nodal 
equations for equivalent voltage sources) between the 



x 



l 

— n 



v 

— n 



where x 
u 



l 

— n 



v 

— n 



136 



voltage across the nonlinear elements and their currents, 

the states, and the known sources. 

For a discrete time solution the sources, u and i , 

— s — n 

may be approximated as piecewise linear functions. Thus 
the solution to (A. 14) is given by 



*k+l = *5 k + r 1 B 1 H s <>=) + r2 B i!i s < k+1 ) +r i B 2i n (k)+r 2 B 2i n (k+1> 

(A. 17) 

where 



$ 


AT 

= e 


-1 




(A. 18) 
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a-1 r AT 
= A [e 


A , AT 

- T (e 


- I) ] 


(A. 19) 




-1 








r 2 


= A _1 [^- 


(e - I) - 


I] 


(A. 20) 



Substituting (A. 15) and (A. 17) in (A. 16) yields 



V n (k+1) = R x (k) +R 1 [B x u s (k)+B 2 i n (k) ] 

+ (S-l+R 2 B 1 )u s (k+l) + (s 2 +Rr 2 B 2 )N f[v n (k+l)] 

(A. 21) 

Knowing the values x(k) , u (k+1) , i (k) , Eq. (A. 21) 

s n 

represents n simultaneous non-linear equations which have 
to be solved for v n (k+l). Eq . (A. 21) may be written in the 

form 



A' V (k+1) + B ' 
— : n 


®i 






c i 




# 

• 
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• 

# 




g 


(v ) 




c 






— n 




n 



(A. 22) 
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For diodes or transistors the functions g^(v_ n ) are function 
of one variable only and have the general form 

kv.: 

g i (v i ) = (e 1 - 1) (A. 23) 

Thus the method (A. 10) is directly applicable and yields 
with the estimate v n (k) the solution closest to this point 
v n (k+l) . Using v n (k+l) in (A. 15) and (A. 16) enables i n (k+l) 
to be calculated. Using these as initial values in (A. 21) 
permits the calculation cycle to be reiterated. 
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